ABSTRACT. The purpose of this article is to present a general method to find 
limiting laws for some renormalized statistics on random permutations. The 
model of random permutations considered here is Ewens sampling model, which 
generalizes uniform random permutations. Under this model, we describe the 
asymptotic behavior of some statistics, including the number of occurrences of 
any dashed pattern. Our approach is based on the method of moments and relies 
on the following intuition: two events involving the images of different integers 
are almost independent. 

1. Introduction 

1.1. Background. Permutations are one of the most classical objects in enumer- 
ative combinatorics. Several statistics have been widely studied: total number of 
cycles, number of cycles of a given length, of descents, inversions, excedances or 
more recently, of occurrences of a given (generalized) pattern... A classical ques- 
tion in enumerative combinatorics consists in computing the (multivariate) gener- 
ating series of permutations with respect to some of these statistics. 

A probabilistic point of view on the topic raises other questions. Let us consider, 
for each N, a probability measure $\mu_N$ on permutations of size N. Then any statistic 
above can be interpreted as a sequence of random variables $(X_N)_{N \ge 1}$. The natural 
question is now: what is the asymptotic behavior (possibly after normalization) of 
$(X_N)_{N \ge 1}$? 

The simplest model of random permutations is of course the uniform random 
permutations (for each N, $\mu_N$ is the uniform distribution on the symmetric group 
$S_N$). A generalization of this model has been introduced by W.J. Ewens in the 
context of population dynamics [16]. It is defined by 

(1)   $\mu_N(\{\sigma\}) = \frac{\theta^{\#(\sigma)}}{\theta(\theta+1)\cdots(\theta+N-1)}$, 

where $\theta > 0$ is a fixed real parameter and $\#(\sigma)$ stands for the number of cycles 
of the permutation $\sigma$. Of course, when $\theta = 1$, we recover the uniform distribution. 
From now on, we will allow ourselves a small abuse of language and use the 
expression Ewens random permutation for a random permutation distributed with 
Ewens measure. 
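
As a sanity check of Equation (1), the following minimal sketch enumerates a small symmetric group and verifies that the weights sum to one. The function names `cycle_count` and `ewens_probability`, the 0-based encoding of permutations and the value of `theta` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation of {0, ..., n-1} given as a tuple of images."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return cycles


def ewens_probability(perm, theta):
    """Weight theta^{#cycles} / (theta (theta+1) ... (theta+N-1)), as in Equation (1)."""
    normalization = Fraction(1)
    for k in range(len(perm)):
        normalization *= theta + k
    return Fraction(theta) ** cycle_count(perm) / normalization


theta = Fraction(5, 2)  # arbitrary positive parameter, chosen only for illustration
total_mass = sum(ewens_probability(p, theta) for p in permutations(range(4)))
```

With `theta = 1` every permutation of size N receives weight 1/N!, which is the uniform case mentioned above.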
The purpose of this article is to introduce a new general approach to this family 
of problems, based on the method of moments. 

We then use it to determine the second-order fluctuations of a large family of 
statistics on permutations: occurrences of dashed patterns (Theorem 1.6). 

Random permutations, either with uniform or Ewens distribution, are well-studied 
objects. Giving a complete list of references is impossible. In Section 
1.5 we compare our results with the literature. 

1.2. Motivating examples. Let us begin by describing a few examples of results, 
which suggest a uniform and intuitive approach. 



Number of cycles of a given length p. Let $Y_p^{(N)}$ be the random variable given by 
the number of cycles of length p in an Ewens random permutation $\sigma$ in $S_N$. The 
asymptotic distribution of $Y_p^{(N)}$ has been studied by V.L. Goncharov [19] and V.F. 
Kolchin [24] in the case of uniform measure and by G.A. Watterson [32, Theorem 
5] for the framework of a general Ewens distribution (see also [1, Theorem 5.1]). 

Theorem 1.1 ([32]). Let p be a positive integer. When N tends to infinity, $Y_p^{(N)}$ 
converges in distribution towards a Poisson law of parameter $\theta/p$. Moreover, the 
sequences of random variables $(Y_p^{(N)})_{N \ge 1}$ for different values of p are asymptotically 
independent. 

Let us give an intuitive (but false) explanation of the first part of the result, 
assuming that some non-independent variables are independent. 

If $(i_1, \dots, i_p)$ is a list of pairwise distinct integers between 1 and N such that its 
minimum is $i_1$ (there are $(N)_p/p$ such lists, where $(N)_p$ is the usual notation for 
the falling factorial $(N)_p = N(N-1)\cdots(N-p+1)$), we define 

(2)   $B^{c,N}_{(i_1,\dots,i_p)}(\sigma) = \begin{cases} 1 & \text{if } (i_1 \, \dots \, i_p) \text{ is a cycle of } \sigma; \\ 0 & \text{otherwise.} \end{cases}$ 

Each $B^{c,N}_{(i_1,\dots,i_p)}$ is distributed according to a Bernoulli law of parameter 
$\theta/(N+\theta-1)_p$ (see Lemma 3.1), which is equivalent to $\theta/(N)_p$ when N tends to 
infinity. These variables are not independent. Nevertheless the sum $Y_p^{(N)}$ 
of these $(N)_p/p$ Bernoulli variables converges in distribution 
towards a Poisson variable of parameter $\theta/p$. 
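
The first moment behind this heuristic can be checked by brute force. The sketch below assumes the Bernoulli parameter $\theta/(N+\theta-1)_p$ given by Lemma 3.1 and compares the exact expectation of the number of p-cycles with $(N)_p/p$ times this parameter. All function names and the chosen values of N, p and theta are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def cycle_lengths(perm):
    """Cycle lengths of a permutation of {0, ..., n-1} given as a tuple of images."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, j = 0, start
            while j not in seen:
                seen.add(j)
                j = perm[j]
                length += 1
            lengths.append(length)
    return lengths


def expected_cycles(n, p, theta):
    """Expectation of the number of p-cycles under the Ewens measure, by enumeration."""
    weighted = total = Fraction(0)
    for perm in permutations(range(n)):
        lengths = cycle_lengths(perm)
        weight = Fraction(theta) ** len(lengths)
        total += weight
        weighted += weight * lengths.count(p)
    return weighted / total


def falling(x, p):
    """Falling factorial (x)_p = x (x-1) ... (x-p+1)."""
    value = Fraction(1)
    for k in range(p):
        value *= x - k
    return value


n, p, theta = 6, 2, Fraction(5, 2)  # small illustrative values
exact = expected_cycles(n, p, theta)
predicted = falling(n, p) / p * theta / falling(n + theta - 1, p)
```

Since $(N)_p/(N+\theta-1)_p$ tends to 1, the expectation tends to $\theta/p$, in agreement with the Poisson parameter of Theorem 1.1.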



Excedances. A (weak) excedance of a permutation $\sigma$ in $S_N$ is an integer i such 
that $\sigma(i) \ge i$. Let $B_i^{ex,N}$ be the random variable defined by: 

$B_i^{ex,N}(\sigma) = \begin{cases} 0 & \text{if } \sigma(i) < i; \\ 1 & \text{if } \sigma(i) \ge i. \end{cases}$ 

When $\sigma$ is a Ewens random permutation, this random variable is distributed according 
to a Bernoulli law of parameter $\frac{N-i+\theta}{N+\theta-1}$ (see Lemma 3.1). 
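
The Bernoulli parameter of the excedance indicator follows from Lemma 3.1: the value i has probability $\theta/(N+\theta-1)$ and each of the N-i larger values has probability $1/(N+\theta-1)$. A minimal brute-force check of the resulting value $(N-i+\theta)/(N+\theta-1)$, with illustrative function names and parameter values, could look as follows.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation of {0, ..., n-1} given as a tuple of images."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


def excedance_probability(n, i, theta):
    """P(sigma(i) >= i) under the Ewens measure on S_n, for a 1-based position i."""
    hit = total = Fraction(0)
    for perm in permutations(range(n)):
        weight = Fraction(theta) ** cycle_count(perm)
        total += weight
        if perm[i - 1] >= i - 1:  # same comparison after shifting to 0-based values
            hit += weight
    return hit / total


n, theta = 5, Fraction(3, 2)  # small illustrative values
observed = [excedance_probability(n, i, theta) for i in range(1, n + 1)]
expected = [(n - i + theta) / (n + theta - 1) for i in range(1, n + 1)]
```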
Let x be a fixed real number in [0; 1] and $\sigma$ a permutation of size N. When Nx 
is an integer, we define 

$F_\sigma^{(N)}(x) := \frac{1}{N} \sum_{i=1}^{Nx} B_i^{ex,N}(\sigma)$, 

and we extend the function $F_\sigma^{(N)}$ by linearity between the points i/N and (i+1)/N 
(for $1 \le i \le N-1$). In Sections 6.1 and 6.2 we explain why we are interested 
in this quantity: it is related to a statistical physics model, the symmetric simple 
exclusion process (SSEP), and to permutation tableaux, some combinatorial objects 
which have been intensively studied in the last few years. 
We show the following. 

Theorem 1.2. Let x be a real number between 0 and 1 and $\sigma$ a random permutation 
of size N, taken with Ewens distribution. Then, almost surely, 

$\lim_{N \to \infty} F_\sigma^{(N)}(x) = \frac{1 - (1-x)^2}{2}$. 

Moreover, if we define the rescaled fluctuations 

$Z_\sigma^{(N)}(x) := \sqrt{N} \left( F_\sigma^{(N)}(x) - E\big(F_\sigma^{(N)}(x)\big) \right)$, 

then, for any $x_1, \dots, x_r$, the vector $(Z_\sigma^{(N)}(x_1), \dots, Z_\sigma^{(N)}(x_r))$ converges towards 
a Gaussian vector $(G(x_1), \dots, G(x_r))$ of covariance matrix $(K(x_i, x_j))_{1 \le i,j \le r}$, 
for some explicit function K (see Section 6). 



If $i \ne j$, the variables $B_i^{ex,N}$ and $B_j^{ex,N}$ are not independent (their covariance 
is computed explicitly in Section 6.4). Nevertheless, the first-order limit and the 
Gaussian fluctuations of order $N^{-1/2}$ correspond to what would happen with independent 
variables (only the actual value of the covariance matrix $(K(x_i, x_j))$ is 
different). 

With this formulation, Theorem 1.2 is new, but the first part is quite easy while 
the second is a consequence of [15, Appendix A] (see Section 6). We also refer to 
an article of A. Barbour and S. Janson Q, where the case of the uniform measure 
is addressed with another method. 



Adjacencies. We consider here only uniform random permutations, that is the 
case $\theta = 1$. An adjacency of a permutation $\sigma$ in $S_N$ is an integer i such that 
$\sigma(i+1) = \sigma(i) \pm 1$. As above, we introduce the random variable $B_i^{ad,N}$ which takes 
value 1 if i is an adjacency and 0 otherwise. Then $B_i^{ad,N}$ is distributed according 
to a Bernoulli law of parameter $2/N$. An easy computation shows that they are not 
independent. 

We are interested in the total number of adjacencies in $\sigma$, that is the random 
variable on $S_N$ defined by $A^{(N)} = \sum_{i=1}^{N-1} B_i^{ad,N}$. 

Theorem 1.3 (El). $A^{(N)}$ converges in distribution towards a Poisson variable of 
parameter 2. 
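
For small N, the law of the number of adjacencies can be computed exactly. The sketch below, in which the helper names and the size n = 6 are illustrative assumptions, tabulates this law for uniform permutations and checks that its mean is $(N-1) \cdot 2/N$, which tends to the Poisson parameter 2.

```python
from fractions import Fraction
from itertools import permutations


def adjacency_count(perm):
    """Number of positions k with |perm[k+1] - perm[k]| = 1."""
    return sum(1 for k in range(len(perm) - 1) if abs(perm[k + 1] - perm[k]) == 1)


def adjacency_distribution(n):
    """Exact law of the number of adjacencies of a uniform permutation of size n."""
    counts, total = {}, 0
    for perm in permutations(range(1, n + 1)):
        a = adjacency_count(perm)
        counts[a] = counts.get(a, 0) + 1
        total += 1
    return {a: Fraction(c, total) for a, c in sorted(counts.items())}


def mean(distribution):
    """Expectation of a finite distribution given as {value: probability}."""
    return sum(value * prob for value, prob in distribution.items())


law = adjacency_distribution(6)
```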
This result first appeared in papers of J. Wolfowitz and I. Kaplansky [34, 23] 
and was rediscovered recently in the context of genomics (see [35] and also ifTTl 
Theorem 10]). Note that it corresponds exactly to what would have been obtained 
if the variables $B_i^{ad,N}$ were independent. 

Of course, as the Bernoulli random variables considered in each of these examples 
are not independent, the explanations given for these results are not rigorous 
proofs. Nevertheless, the considered events involve (most of the time) the images 
of different integers by the permutation $\sigma$. Therefore, speaking informally, they are 
almost independent. The main lemma of this paper is a precise statement of this 
almost independence, that is an upper bound on joint cumulants. This result allows 
us to give new proofs of the three results presented above in a uniform way. 

1.3. The main lemma. From now on, N is a positive integer and $\sigma$ a random 
Ewens permutation in $S_N$. 

If i and s are two integers in [N], we consider the Bernoulli variable $B^{(N)}_{i,s}$ which 
takes value 1 if and only if $\sigma(i) = s$. Despite its simple definition, this collection 
of events allows one to reconstruct the permutation and thus generates the full algebra 
of observables (we call them elementary events). 

For random variables $X_1, \dots, X_\ell$ on the same probability space, we denote 
$\kappa(X_1, \dots, X_\ell)$ their joint cumulant. Joint cumulants generalize the notion of covariance 
(corresponding to $\ell = 2$). They somehow measure how dependent random 
variables are. Their definition is given in Section 2.2. 

Our main lemma is a bound on joint cumulants of products of elementary events. 
To state it, we introduce the following notations. Consider two lists of positive 
integers of the same length $i = (i_1, \dots, i_r)$ and $s = (s_1, \dots, s_r)$ and define the 
graphs $G_1(i, s)$ and $G_2(i, s)$ as follows: 

• the vertex set of $G_1(i, s)$ is [r] and j and h are linked in $G_1(i, s)$ if and 
only if $i_j = i_h$ and $s_j = s_h$. 

• the vertex set of $G_2(i, s)$ is also [r] and j and h are linked in $G_2(i, s)$ if and 
only if $\{i_j, s_j\} \cap \{i_h, s_h\} \ne \emptyset$. 

The connected components of a graph G form a set partition of its vertex set that we 
denote CC(G). In particular, #(CC(G)) is the number of connected components 
of G. 

Theorem 1.4 (main lemma). Fix a positive integer r. There exists a constant $C_r$, 
depending on r, such that for any set partition $\tau = (\tau_1, \dots, \tau_\ell)$ of [r], any $N \ge 1$ 
and lists $i = (i_1, \dots, i_r)$ and $s = (s_1, \dots, s_r)$ of integers in [N], one has: 

(3)   $\left| \kappa\left( \prod_{j \in \tau_1} B^{(N)}_{i_j,s_j}, \dots, \prod_{j \in \tau_\ell} B^{(N)}_{i_j,s_j} \right) \right| \le C_r \, N^{-\#(CC(G_1(i,s))) - \#(CC(G_2(i,s)) \vee \tau) + 1}$. 

Note that the integer $\#(CC(G_1(i, s)))$ is the number of different couples $(i_j, s_j)$. 
The second quantity involved in the theorem, $\#(CC(G_2(i, s)) \vee \tau)$, does not have 
a similar interpretation. However, it admits an equivalent description. Consider 
the graph $G'_2$, obtained from $G_2$ by merging vertices corresponding to elements 
in the same part of $\tau$. Then $\#(CC(G_2(i, s)) \vee \tau)$ is the number of connected 
components of $G'_2$. 
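
The two quantities appearing in the exponent of the main lemma can be computed with a small union-find routine. In the sketch below, merging the parts of $\tau$ is implemented by adding edges inside each part before counting components, which matches the description of $G'_2$. The function names and the 0-based encoding of the set partition are illustrative assumptions.

```python
def count_components(r, edges):
    """Number of connected components of the graph on {0, ..., r-1} with the given edges."""
    parent = list(range(r))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for j, h in edges:
        parent[find(j)] = find(h)
    return len({find(x) for x in range(r)})


def exponent_data(i, s, tau):
    """Return (#CC(G_1(i,s)), #(CC(G_2(i,s)) v tau)); tau is a list of lists of 0-based indices."""
    r = len(i)
    pairs = [(j, h) for j in range(r) for h in range(j + 1, r)]
    g1 = [(j, h) for j, h in pairs if i[j] == i[h] and s[j] == s[h]]
    g2 = [(j, h) for j, h in pairs if {i[j], s[j]} & {i[h], s[h]}]
    # Joining with tau amounts to linking all vertices lying in the same part.
    tau_edges = [(part[0], x) for part in tau for x in part[1:]]
    return count_components(r, g1), count_components(r, g2 + tau_edges)


distinct_case = exponent_data((1, 2, 3), (4, 5, 6), [[0, 1], [2]])
repeated_case = exponent_data((1, 1, 2), (2, 2, 3), [[0], [1], [2]])
```

When all entries are distinct, the first quantity is r and the second is the number of parts of $\tau$, so the bound reads $N^{-r-\ell+1}$.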

As an example, let us consider the case where the entries in the lists i and s 
are pairwise distinct. In this case, the joint moment of products of $B^{(N)}_{i,s}$ is simply 
$1/(N+\theta-1)_a$, where a is the number of factors (the case $\theta = 1$ is obvious, the 
general case is explained in Lemma 3.1). Joint cumulants can be expressed in terms 
of joint moments (see Eq. (8)), so the left-hand side of (3) can be written as an 
explicit rational function of degree $-r$. According to our main lemma, the sum has 
degree at most $-\ell - r + 1$, which means that many simplifications are happening 
(they are not at all trivial to explain!). This reflects the fact that the variables $B^{(N)}_{i,s}$ 
are very weakly correlated. 

Remark 1.5. It is worth noticing that our proof of the main lemma goes through a 
very general criterion for a family of sequences of random variables to have small 
cumulants: see Lemma 3.2. 

1.4. Applications. Theorem 1.4 can be used to give new proofs of Theorems 1.1, 
1.2 and 1.3. Moreover, we get an extension of Theorem 1.3 to any value of the 
parameter $\theta$. 

We must confess that our proofs of these results are quite technical. However, 
an important part of the difficulty is contained in the proof of Theorem 1.4 and 
hence need not be done again for each application. Moreover, these proofs are 
natural in the following sense: they are based on the idea that, when $\sigma$ is a uniform 
random permutation, $\sigma(i)$ and $\sigma(j)$ are almost independent. Besides, although the 
problems may seem quite different (in particular the limit law is not always the 
same), these proofs all follow roughly the same guidelines. 

To give more evidence that our approach is quite general, we study the number 
of occurrences of dashed patterns. This notion was introduced in 2000 by E. 
Babson and E. Steingrímsson, because it gives a general setting which includes a 
lot of usual statistics of permutations [3]. 

Thanks to our main lemma, we describe the second order asymptotics of the 
number of occurrences of any given dashed pattern in a random Ewens permuta- 
tion. 

Theorem 1.6. Let $(\tau, X)$ be a dashed pattern of size p (see Definition 7.3) and 
$(\sigma_N)$ a sequence of random permutations, each $\sigma_N$ being of size N distributed with 
Ewens measure. We denote $q = |X|$. Then, $\frac{O^{(N)}_{\tau,X}(\sigma_N)}{N^{p-q}}$, that is the renormalized 
number of occurrences of $(\tau, X)$, tends almost surely towards $\frac{1}{p!\,(p-q)!}$. Besides, 
one has the following central limit theorem: 

$\sqrt{N} \left( \frac{O^{(N)}_{\tau,X}(\sigma_N)}{N^{p-q}} - \frac{1}{p!\,(p-q)!} \right) \to \mathcal{N}(0, V_{\tau,X})$, 

where the arrow denotes a convergence in distribution and $V_{\tau,X}$ is some nonnegative 
real number. 

Footnote. In the paper of Babson and Steingrímsson, dashed patterns are called generalized 
patterns. But, as some more general generalized patterns have been introduced since (see 
next section), we prefer to use dashed patterns. 

This theorem is proved in Section 7.3 using Theorem 1.4. 

Unfortunately, we are not able to show in general that the constant $V_{\tau,X}$ is positive 
($V_{\tau,X} = 0$ would mean that we have not chosen the right normalization). The 
following partial result has been proved by M. Bóna [9, Propositions 1 and 2] (M. 
Bóna works with the uniform distribution, but it should not be too hard to show 
that $V_{\tau,X}$ does not depend on $\theta$). 

Proposition 1.7. For any $k \ge 1$, $\tau = \mathrm{Id}_k$ and $X = \emptyset$ or $X = [k-1]$, 
Conjecture 1.8 holds true. 

The proof relies on an expression of $V_{\tau,X}$ as a signed sum of products of binomial 
coefficients. This expression can be extended to the general case and we have 
checked by computer the following conjecture for all patterns of size 8 or less. 

Conjecture 1.8. For any dashed pattern $(\tau, X)$, one has $V_{\tau,X} > 0$. 

1.5. Comparison with other methods. There is a huge literature on random per- 
mutations. While we will not make a comprehensive survey of the subject, we shall 
try to present the existing methods and results related to our paper. 

Our Poisson convergence results have been obtained previously by the moment 
method in the articles [23] and [32]. Our cumulant approach is not really different 
from these proofs. Yet, we have chosen to present these examples for two reasons: 

• first, it illustrates the fact that our approach can be used with different limit 
laws ; 

• second, the combinatorics is simpler in the Poisson cases, so they serve as 
toy model to explain the general structure of the proofs. 

Let us mention also the existence of a powerful method, called the Stein-Chen 
method, that proves Poisson convergence, together with precise bounds on total 
variation distances - see, e.g., J31 Chapter 4]. 

Let us now consider our normal approximation results. For uniform permuta- 
tions, both are already known or could be obtained easily with methods existing in 
the literature. 

• Theorem 1.2 has been proved by A. Barbour and S. Janson Q, who established 
a functional version of a combinatorial central limit theorem from 
Hoeffding [21]. This theorem deals with statistics of the form 

$\sum_{1 \le i,j \le N} a^{(N)}_{i,j} B^{(N)}_{i,j}$, 

where $(a^{(N)})_{N \ge 1}$ is a sequence of deterministic $N \times N$ matrices. 

• Theorem 1.6 has been proven for some particular patterns using dependency 
graphs and cumulants: see Theorems 10 and 17] and [20, Section 
6]. The case of a general pattern (under uniform distribution) can be handled 
with the same arguments. 

These methods are very different from one another and none of them can be 
used to prove both results in a uniform way. Note also that they only work in the 
uniform case. Yet, going from the uniform model to a general Ewens distribution 
should be doable with a coupling argument, see below. 

Our method has the advantage of providing a uniform proof for both results and 
of extending directly to a general Ewens distribution. As it uses cumulants, our approach 
is close to the dependency graph method. However, we deal with sums of 
pairwise dependent random variables and, to our knowledge, it is the first time that 
cumulants are used in this framework. We hope that this idea can be used on other 
objects than permutations, see next section. 

To be comprehensive, let us mention two other tools. The first one is the use 
of bivariate generating series, as illustrated in the book of P. Flajolet and R. 
Sedgewick, see [18, Examples IX.3, IX.4, IX.5, IX.9]. However, computing the 
bivariate generating series of permutations with respect to their size and the number 
of occurrences of a given pattern is known to be a hard problem, so it is very unlikely 
that this method could be used to establish Theorem 1.6. 

The second one is the use of couplings. A well-known coupling for random 
permutations is the Feller coupling (see, e.g., [1, page 16]), which allows one to prove 
Theorem 1.1 with bounds on total variation distances. There also exists the so-called 
Chinese restaurant process [1, Example 2.4], which defines a coupling between 
Ewens random permutations and uniform random permutations. With this coupling, 
a Ewens random permutation differs from a uniform random permutation by 
$O(|\theta - 1| \log(n))$ values. Therefore, it should be possible to deduce Theorem 1.2 
and Theorem 1.6 for the general Ewens distribution from the case of the uniform 
distribution. 

1.6. Future work. In addition to the conjecture above, we mention three directions 
for further research on the topic. 

The notion of dashed patterns has been further extended to the notion of generalized 
patterns in a recent paper of M. Bousquet-Mélou, A. Claesson, M. Dukes 
and S. Kitaev [10, Section 2]. Unfortunately, we have not been able to obtain a 
general result for the asymptotic number of occurrences of generalized patterns. 
Finding such a result is, in the author's opinion, a challenging open problem. One 
could even consider a more general framework, see Section 7.4. 

Another direction consists in refining our convergence results (speed of convergence, 
large deviations, local limit laws) by following the same guideline. 

Finally, it is natural to wonder if the method can be extended to other families 
of objects. The extension to colored permutations should be straightforward. A 
promising direction is the following: consider a graph G with vertex set [n] and 
take some random subset S of its vertices, uniformly among all subsets of size p 
(for some fixed number p). If p grows linearly with n, then the events "i lies in S" 
(for $1 \le i \le n$) have small joint cumulants (this is easy to see with the material of 
Section 3). 

1.7. Outline of the paper. The paper is organized as follows. Section 2 presents 
the necessary material: set partitions and cumulants. In Section 3 we prove our 
main lemma. Then, in Section 4 we give two easy lemmas on connected components 
of graphs, which appear in all our applications. The last three sections 
are devoted to the different applications: Section 5 for the cycles, Section 6 for the 
excedances and finally, Section 7 for the generalized patterns (including the adjacencies 
and the dashed patterns). 

2. Preliminaries : set partitions and joint cumulants 

2.1. Set partitions. The combinatorics of set partitions is central in the theory of 
cumulants (as explained below) and will be important in this article. 

A set partition of a set S is a (non-ordered) family of non-empty disjoint subsets 
of S (called parts of the partition), whose union is S. 

Denote $\mathcal{P}(S)$ the set of set partitions of a given set S. Then $\mathcal{P}(S)$ may be 
endowed with a natural partial order: the refinement order. We say that $\pi$ is finer 
than $\pi'$ or $\pi'$ coarser than $\pi$ (and denote $\pi \le \pi'$) if every part of $\pi$ is included in a 
part of $\pi'$. 

Endowed with this order, $\mathcal{P}(S)$ is a complete lattice, which means that each 
family F of set partitions admits a join (the finest set partition which is coarser 
than all set partitions in F, denoted with $\vee$) and a meet (the coarsest set partition 
which is finer than all set partitions in F, denoted with $\wedge$). In particular, there 
is a maximal element {S} (the partition in only one part) and a minimal element 
$\{\{x\}, x \in S\}$ (the partition in singletons). 

Moreover, this lattice is ranked: the rank $rk(\pi)$ of a set partition $\pi$ is $|S| - \#(\pi)$, 
where $\#(\pi)$ denotes the number of parts of $\pi$. The rank is compatible with the 
lattice structure in the following sense: for all set partitions $\pi$ and $\pi'$, 

(4)   $rk(\pi \vee \pi') \le rk(\pi) + rk(\pi')$. 

Lastly, denote $\mu$ the Möbius function of the partition lattice $\mathcal{P}(S)$. In this paper, 
we only use evaluations of $\mu$ at pairs $(\pi, \{S\})$ (that is the second argument is the 
maximum element of $\mathcal{P}(S)$). In this case, the value of the Möbius function is given 
by: 

(5)   $\mu(\pi, \{S\}) = (-1)^{\#(\pi)-1} (\#(\pi)-1)!$. 
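
Formula (5) can be tested against the defining property of the Möbius function: for a set S with at least two elements, the values $\mu(\pi, \{S\})$ summed over all set partitions $\pi$ of S must vanish. The following sketch enumerates set partitions recursively. The function names are illustrative assumptions.

```python
from math import factorial


def set_partitions(elements):
    """Yield all set partitions of a list, each as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` in a new block, or add it to one of the existing blocks.
        yield [[first]] + partition
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]


def mobius_to_top(partition):
    """mu(pi, {S}) = (-1)^{#(pi)-1} (#(pi)-1)!, as in Equation (5)."""
    blocks = len(partition)
    return (-1) ** (blocks - 1) * factorial(blocks - 1)


def mobius_sum(n):
    """Sum of mu(pi, {S}) over all set partitions pi of an n-element set S."""
    return sum(mobius_to_top(p) for p in set_partitions(list(range(n))))
```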

2.2. Cumulants. We present in this section the definition and basic properties of 
joint cumulants. Most of this material can be found in Leonov's and Shiryaev's 
paper [25] (see also [22, Proposition 6.16]). 

Definition. 

Joint cumulants are defined as follows: if $X_1, \dots, X_\ell$ are random variables on the same 
probability space (denote E the expectation on this space), then 

(6)   $\kappa(X_1, \dots, X_\ell) = [t_1 \cdots t_\ell] \ln\Big( E\big( \exp(t_1 X_1 + \cdots + t_\ell X_\ell) \big) \Big)$. 

As usual, $[t_1 \cdots t_\ell] F$ stands for the coefficient of $t_1 \cdots t_\ell$ in the series expansion 
of F in positive powers of $t_1, \dots, t_\ell$. Note that joint cumulants are multilinear 
functions. In the case where all the $X_i$ are equal, we recover the $\ell$-th cumulant 
$\kappa_\ell(X)$ of a single variable ifTTl . 

Joint cumulants can be expressed in terms of joint moments, and vice-versa. 
Denote $[\ell]$ the set $\{1, \dots, \ell\}$. 

(7)   $E(X_1 \cdots X_\ell) = \sum_{\pi \in \mathcal{P}([\ell])} \prod_{C \in \pi} \kappa(X_i ; i \in C)$; 

(8)   $\kappa(X_1, \dots, X_\ell) = \sum_{\pi \in \mathcal{P}([\ell])} \mu(\pi, \{[\ell]\}) \prod_{C \in \pi} E\Big( \prod_{i \in C} X_i \Big)$. 

Recall that $\mu(\pi, \{[\ell]\})$ has an explicit expression given by Equation (5). For example 
the joint cumulants of one or two variables are simply the mean of a single 
random variable ($\kappa(X_1) = E(X_1)$) and the covariance of a couple of random variables 
($\kappa(X_1, X_2) = E(X_1 X_2) - E(X_1) E(X_2)$). For three variables, one has 

$\kappa(X_1, X_2, X_3) = E(X_1 X_2 X_3) - E(X_1 X_2) E(X_3) - E(X_1 X_3) E(X_2) - E(X_2 X_3) E(X_1) + 2 E(X_1) E(X_2) E(X_3)$. 
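Cumulants of independent random variables. 

An interesting property of cumulants is the following: if the set of variables 
$\{X_i, 1 \le i \le \ell\}$ can be split into two non-empty sets $\{X_i, i \in A\}$ and $\{X_i, i \in B\}$ (with 
$A \sqcup B = [\ell]$) such that the variables from the first set are independent from the 
variables from the second, then 

$\kappa(X_1, \dots, X_\ell) = [t_1 \cdots t_\ell] \ln\Big( E\big( \exp(\sum_{i \in A} t_i X_i) \big) \Big) + [t_1 \cdots t_\ell] \ln\Big( E\big( \exp(\sum_{i \in B} t_i X_i) \big) \Big) = 0$. 

Indeed, each of the two logarithms involves only part of the variables $t_1, \dots, t_\ell$, 
so the coefficient of $t_1 \cdots t_\ell$ in it vanishes. 
Because of this strong property, joint cumulants can be seen as a quantification of 
the dependence of random variables. 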
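
Formula (8) translates directly into a small program for random variables on a finite probability space. The sketch below uses two independent fair coins (an illustrative assumption, as are all names) and checks both the covariance interpretation and the vanishing of mixed cumulants for independent variables.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def set_partitions(elements):
    """Yield all set partitions of a list, each as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]


def joint_cumulant(variables, space):
    """Joint cumulant via Equation (8); `space` is a list of (outcome, probability) pairs."""
    def moment(block):
        total = Fraction(0)
        for outcome, prob in space:
            term = prob
            for index in block:
                term *= variables[index](outcome)
            total += term
        return total

    result = Fraction(0)
    for partition in set_partitions(list(range(len(variables)))):
        blocks = len(partition)
        term = Fraction((-1) ** (blocks - 1) * factorial(blocks - 1))  # Equation (5)
        for block in partition:
            term *= moment(block)
        result += term
    return result


# Two independent fair coins; outcomes are pairs (a, b) in {0, 1}^2.
space = [((a, b), Fraction(1, 4)) for a, b in product((0, 1), repeat=2)]
first = lambda w: w[0]
second = lambda w: w[1]
both = lambda w: w[0] * w[1]
```

Here `joint_cumulant([first, first], space)` is the variance of a fair coin, while any cumulant mixing `first` and `second` alone is zero.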

Convergence in distribution using cumulants. 

Consider now m sequences of random variables: $(X_n^{(i)})_{n \ge 1}$ for $i \in [m]$. A 
consequence of Equations (7) and (8) is that the convergence of all joint cumulants 

$\kappa(X_n^{(i_1)}, \dots, X_n^{(i_\ell)})$;   $\ell \ge 1$, $1 \le i_1, \dots, i_\ell \le m$ 

is equivalent to the convergence of all joint moments 

$E(X_n^{(i_1)} \cdots X_n^{(i_\ell)})$;   $\ell \ge 1$, $1 \le i_1, \dots, i_\ell \le m$. 

In particular, if $Y^{(1)}, \dots, Y^{(m)}$ are random variables such that the law of the m-tuple 
$(Y^{(1)}, \dots, Y^{(m)})$ is entirely determined by its joint moments, then the two 
following statements are equivalent (see [6, Theorem 30.2] for the same property 
in terms of moments). 

• For any $\ell$ and any list $i_1, \dots, i_\ell$ in [m], 

$\lim_{n \to \infty} \kappa(X_n^{(i_1)}, \dots, X_n^{(i_\ell)}) = \kappa(Y^{(i_1)}, \dots, Y^{(i_\ell)})$. 

• The sequence of m-tuples $(X_n^{(1)}, \dots, X_n^{(m)})$ converges in distribution towards 
$(Y^{(1)}, \dots, Y^{(m)})$. 

Recall that Gaussian and Poisson variables are determined by their moments, see 
e.g. the criterion [6, Theorem 30.1]. Hence, cumulants can be used to prove convergence 
in distribution towards Gaussian or Poisson variables, such as the results 
of the previous section. 



3. Proof of the main lemma 

3.1. Joint moments. The first step of the proof consists in computing the joint 
moments of the family of random variables $(B^{(N)}_{i,s})_{1 \le i,s \le N}$. 

Note that $(B^{(N)}_{i,s})^2 = B^{(N)}_{i,s}$, while $B^{(N)}_{i,s} B^{(N)}_{i,s'} = 0$ if $s \ne s'$, and $B^{(N)}_{i,s} B^{(N)}_{i',s} = 0$ 
if $i \ne i'$. Therefore, we can restrict ourselves to the computation of the joint moment 
$E\big( B^{(N)}_{i_1,s_1} \cdots B^{(N)}_{i_r,s_r} \big)$ in the case where $i = (i_1, \dots, i_r)$ and $s = (s_1, \dots, s_r)$ 
are two lists of pairwise distinct indices (some entry in the list i can be equal to an 
entry of s). 

We see these two lists as a partial permutation 

$\tilde{\sigma}_{i,s} = \begin{pmatrix} i_1 & \dots & i_r \\ s_1 & \dots & s_r \end{pmatrix}$ 

which sends $i_j$ to $s_j$. The notion of cycles of a permutation can be naturally extended 
to partial permutations: $(i_{j_1}, \dots, i_{j_\gamma})$ is a cycle of the partial permutation if 
$s_{j_1} = i_{j_2}$, $s_{j_2} = i_{j_3}$ and so on until $s_{j_\gamma} = i_{j_1}$. Note that a partial permutation does 
not necessarily have cycles. The number of cycles of $\tilde{\sigma}_{i,s}$ is denoted $\#(\tilde{\sigma}_{i,s})$. 

The computation of $E\big( B^{(N)}_{i_1,s_1} \cdots B^{(N)}_{i_r,s_r} \big)$ relies on two important properties of 
the Ewens measure. First, it is conjugacy-invariant. Second, a random sampling 
can be obtained inductively by the following procedure (see, e.g. [1, Example 
2.19]). 

Suppose that we have a permutation $\sigma$ of size N - 1 taken with this distribution. 
Write it as a product of cycles and apply the following transformation. 

• With probability $\theta/(N+\theta-1)$, add N as a fixed point. More precisely, 
$\sigma'$ is defined by: 

$\sigma'(i) = \sigma(i)$ for $i < N$; 
$\sigma'(N) = N$. 

• For each j in [N - 1], with probability $1/(N+\theta-1)$, add N just before j in its cycle. 
More precisely, $\sigma'$ is defined by: 

$\sigma'(i) = \sigma(i)$ for $i \ne \sigma^{-1}(j), N$; 
$\sigma'(N) = j$; 
$\sigma'(\sigma^{-1}(j)) = N$. 

Then $\sigma'$ is a random permutation of $S_N$ distributed with Ewens measure. Iterating 
this, one obtains a linear time and space algorithm to pick a random permutation 
distributed with Ewens measure. 
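
The inductive procedure above can be sketched as follows. This is one possible implementation under stated assumptions: elements are encoded as 0, ..., n-1 instead of 1, ..., N, the function name `sample_ewens` is hypothetical, and an inverse table is maintained so that each insertion costs constant time.

```python
import random


def sample_ewens(n, theta, rng=random):
    """Sample a permutation of {0, ..., n-1} (as a list of images) with Ewens parameter theta."""
    sigma = []    # sigma[i] is the image of i
    inverse = []  # inverse[j] is the preimage of j
    for new in range(n):
        # The current size becomes new + 1, so a fixed point is created
        # with probability theta / (new + theta).
        if rng.random() * (new + theta) < theta:
            sigma.append(new)
            inverse.append(new)
        else:
            # Otherwise insert `new` just before a uniformly chosen element j.
            j = rng.randrange(new)
            predecessor = inverse[j]
            sigma.append(j)              # sigma'(new) = j
            inverse.append(predecessor)  # the preimage of new is the old preimage of j
            sigma[predecessor] = new     # sigma'(sigma^{-1}(j)) = new
            inverse[j] = new
    return sigma


rng = random.Random(0)  # fixed seed, only to make the illustration reproducible
sample = sample_ewens(50, 2.0, rng)
```

Each of the `new` possible insertions has probability $1/(\text{new}+\theta)$, which matches the second bullet above with N = new + 1.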

Let us come back now to the computation of joint moments. 

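Lemma 3.1. Let $\sigma$ be a random permutation taken with Ewens distribution. Then 
one has 

$E\big( B^{(N)}_{i_1,s_1} \cdots B^{(N)}_{i_r,s_r} \big) = \frac{\theta^{\#(\tilde{\sigma}_{i,s})}}{(N+\theta-1) \cdots (N+\theta-r)}$. 

For example, the parameters of the Bernoulli variables $B^{(N)}_{i,s}$ are given by 

$E\big( B^{(N)}_{i,s} \big) = \frac{\theta}{N+\theta-1}$ if $i = s$, and $\frac{1}{N+\theta-1}$ if $i \ne s$. 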
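
For small N, the formula of Lemma 3.1 can be compared with a direct enumeration of the symmetric group. The sketch below is only a numerical check and not a proof. The function names, the 1-based encoding and the test values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of a permutation of {1, ..., n}; perm[k-1] is the image of k."""
    seen, cycles = set(), 0
    for start in range(1, len(perm) + 1):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j - 1]
    return cycles


def partial_cycle_count(i, s):
    """Number of cycles of the partial permutation sending i[j] to s[j] (entries pairwise distinct)."""
    image = dict(zip(i, s))
    cycles = 0
    for start in image:
        j, smallest = image[start], start
        while j != start and j in image:
            smallest = min(smallest, j)
            j = image[j]
        # Count a cycle once, from its smallest element.
        if j == start and smallest == start:
            cycles += 1
    return cycles


def joint_moment(n, i, s, theta):
    """P(sigma(i[j]) = s[j] for all j) under the Ewens measure on S_n, by enumeration."""
    hit = total = Fraction(0)
    for perm in permutations(range(1, n + 1)):
        weight = Fraction(theta) ** cycle_count(perm)
        total += weight
        if all(perm[a - 1] == b for a, b in zip(i, s)):
            hit += weight
    return hit / total


def lemma_value(n, i, s, theta):
    """Right-hand side of Lemma 3.1."""
    value = Fraction(theta) ** partial_cycle_count(i, s)
    for k in range(1, len(i) + 1):
        value /= n + theta - k
    return value


theta = Fraction(7, 3)  # arbitrary positive parameter
cases = [((1, 2, 4), (2, 1, 5)), ((1, 3), (3, 4)), ((2,), (2,))]
```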

Proof. As Ewens measure is constant on conjugacy classes of $S_N$, one can assume 
without loss of generality that $i_1 = N-r+1$, $i_2 = N-r+2$, ..., $i_r = N$. 
Then permutations of $S_N$ with $\sigma(i_j) = s_j$ are obtained in the previous algorithm 
as follows: 

• Choose any permutation in $S_{N-r}$. 

• For $1 \le j \le r$, add $i_j$ in the place given by the following rule: if $s_j < i_j$, 
add $i_j$ just before $s_j$ in its cycle. Otherwise, look at $\tilde{\sigma}_{i,s}(i_j)$, $\tilde{\sigma}_{i,s}^2(i_j)$ and so 
on until you find an element smaller than $i_j$ and place $i_j$ before it. If there 
is no such element, then $i_j$ is a minimum of a cycle of $\tilde{\sigma}_{i,s}$. In this case, 
put it in a new cycle. 

It is easy to check with the description of the construction of a permutation under 
Ewens measure that these choices of places happen with a probability 

$\frac{\theta^{\#(\tilde{\sigma}_{i,s})}}{(N+\theta-1) \cdots (N-r+\theta)}$.   □ 

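3.2. A general criterion for small cumulants. Let $A_1^{(N)}, \dots, A_\ell^{(N)}$ be $\ell$ sequences 
of random variables. We introduce the following notation for joint moments and 
cumulants of subsets of these variables: if $\Delta = \{j_1, \dots, j_h\}$ is a subset of $[\ell]$, then 

$M_\Delta^{(N)} = E\big( A_{j_1}^{(N)} \cdots A_{j_h}^{(N)} \big)$,   $\kappa_\Delta^{(N)} = \kappa\big( A_{j_1}^{(N)}, \dots, A_{j_h}^{(N)} \big)$. 

We also introduce the auxiliary quantity $U_\Delta^{(N)}$ implicitly defined by: for any subset 
$\Delta \subseteq [\ell]$, 

$M_\Delta^{(N)} = \prod_{\delta \subseteq \Delta} U_\delta^{(N)}$. 

Using Möbius inversion on the boolean lattice, we have explicitly: for any subset 
$\Delta \subseteq [\ell]$, 

$U_\Delta^{(N)} = \prod_{\delta \subseteq \Delta} \big( M_\delta^{(N)} \big)^{(-1)^{|\Delta| - |\delta|}}$. 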
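
The multiplicative Möbius inversion on the boolean lattice can be checked on arbitrary data. In the sketch below, the moment table is filled with arbitrary non-zero rational numbers (an illustrative assumption, not taken from the paper) and the exponent is taken to be $(-1)^{|\Delta| - |\delta|}$. The code recovers the auxiliary quantities and verifies that their products over subsets give back the moments.

```python
from fractions import Fraction
from itertools import combinations


def subsets(delta):
    """All subsets of a finite set, as frozensets."""
    items = sorted(delta)
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            yield frozenset(chosen)


def u_from_moments(moments, delta):
    """U_Delta as the product of M_delta^{(-1)^{|Delta|-|delta|}} over subsets delta of Delta."""
    value = Fraction(1)
    for sub in subsets(delta):
        if (len(delta) - len(sub)) % 2 == 0:
            value *= moments[sub]
        else:
            value /= moments[sub]
    return value


ground = frozenset({1, 2, 3})
# Arbitrary non-zero test values playing the role of joint moments.
moments = {sub: Fraction(len(sub) + 2, 2 * len(sub) + 1) + sum(sub) for sub in subsets(ground)}
moments[frozenset()] = Fraction(1)  # the empty product has expectation 1

u_values = {sub: u_from_moments(moments, sub) for sub in subsets(ground)}


def product_of_u(delta):
    """Product of U_delta over all subsets delta of Delta."""
    value = Fraction(1)
    for sub in subsets(delta):
        value *= u_values[sub]
    return value
```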

Lemma 3.2. Let $A_1^{(N)}, \dots, A_\ell^{(N)}$ be a list of sequences of random variables with 
normalized expectations, that is, for any N and j, $E(A_j^{(N)}) = 1$. Then the following 
statements are equivalent: 

I. Quasi-factorization property: for any subset $\Delta \subseteq [\ell]$ of size at least 2, one 
has 

(9)   $U_\Delta^{(N)} = 1 + O(N^{-|\Delta|+1})$; 

II. Small cumulant property: for any subset $\Delta \subseteq [\ell]$ of size at least 2, one has 

(10)   $\kappa_\Delta^{(N)} = O(N^{-|\Delta|+1})$. 

Proof. Let us consider the implication I ⇒ II. We denote $T_\Delta^{(N)} = U_\Delta^{(N)} - 1$ and 
assume that $T_\Delta^{(N)} = O(N^{-|\Delta|+1})$ for any $\Delta \subseteq [\ell]$ of size at least 2. The goal is to 
prove that $\kappa_{[\ell]}^{(N)} = O(N^{-\ell+1})$. Indeed, this corresponds to the case $\Delta = [\ell]$ of II, 
but the same proof will work for any $\Delta \subseteq [\ell]$. 

Recall the relation between moments and cumulants (Equation (8)): 

$\kappa_{[\ell]}^{(N)} = \sum_{\pi \in \mathcal{P}([\ell])} \mu(\pi, \{[\ell]\}) \prod_{C \in \pi} M_C^{(N)}$. 

But joint moments can be expressed in terms of T: 

$M_C^{(N)} = \prod_{\Delta \subseteq C, \, |\Delta| \ge 2} \big( 1 + T_\Delta^{(N)} \big) = \sum_{\Delta_1, \dots, \Delta_m} T_{\Delta_1}^{(N)} \cdots T_{\Delta_m}^{(N)}$, 

where the sum runs over all finite lists (considered up to reordering) of pairwise distinct 
(but not necessarily disjoint) subsets of C of size at least 2 (in particular, the length m of the list is 
not fixed). When we multiply this over all blocks C of a set partition $\pi$, we obtain 
the sum of $T_{\Delta_1}^{(N)} \cdots T_{\Delta_m}^{(N)}$ over all lists of pairwise distinct subsets of $[\ell]$ of size at 
least 2 such that each $\Delta_i$ is contained in a block of $\pi$. In other terms, for each 
$i \in [m]$, $\pi$ must be coarser than the partition $\Pi(\Delta_i)$, which, by definition, has $\Delta_i$ 
and singletons as blocks. Finally, 

$\kappa_{[\ell]}^{(N)} = \sum_{\Delta_1, \dots, \Delta_m \text{ pairwise distinct}} T_{\Delta_1}^{(N)} \cdots T_{\Delta_m}^{(N)} \Big( \sum_{\pi : \, \forall i, \ \pi \ge \Pi(\Delta_i)} \mu(\pi, \{[\ell]\}) \Big)$. 

The condition on $\pi$ can be rewritten as 

$\pi \ge \Pi(\Delta_1) \vee \cdots \vee \Pi(\Delta_m)$. 

Hence, by definition of the Möbius function, the sum in the parenthesis is equal to 
0, unless $\Pi(\Delta_1) \vee \cdots \vee \Pi(\Delta_m) = \{[\ell]\}$ (in other terms, unless the hypergraph with 
edges $(\Delta_i)_{1 \le i \le m}$ is connected). On the one hand, by Equation (4), it may happen 
only if: 

$\sum_{i=1}^m rk(\Pi(\Delta_i)) = \sum_{i=1}^m (|\Delta_i| - 1) \ge rk(\{[\ell]\}) = \ell - 1$. 

On the other hand, one has 

$T_{\Delta_1}^{(N)} \cdots T_{\Delta_m}^{(N)} = O\big( N^{-\sum_{i=1}^m (|\Delta_i| - 1)} \big)$. 

Hence only summands of order of magnitude $N^{-\ell+1}$ or less survive and one has 

$\kappa_{[\ell]}^{(N)} = O(N^{-\ell+1})$, 

which is exactly what we wanted to prove. 

Let us now consider the converse statement. We proceed by induction on $\ell$ and 
we assume that, for all $\ell'$ smaller than a given $\ell \ge 2$, the lemma holds. 

Consider some sequences of random variables $A_1^{(N)}, \dots, A_\ell^{(N)}$ such that II 
holds. By induction hypothesis, one gets immediately that 

for all $\Delta \subsetneq [\ell]$, $U_\Delta^{(N)} = 1 + O(N^{-|\Delta|+1})$. 

Note that an immediate induction shows that the joint moments fulfill 

for all $\Delta \subsetneq [\ell]$, $M_\Delta^{(N)} = O(1)$ and $(M_\Delta^{(N)})^{-1} = O(1)$. 

It remains to prove that 

$U_{[\ell]}^{(N)} = \prod_{\Delta \subseteq [\ell]} \big( M_\Delta^{(N)} \big)^{(-1)^{\ell - |\Delta|}} = 1 + O(N^{-\ell+1})$. 

Thanks to the estimate above for joint moments, this can be rewritten as 

(12)   $M_{[\ell]}^{(N)} = \prod_{\Delta \subsetneq [\ell]} \big( M_\Delta^{(N)} \big)^{(-1)^{\ell - |\Delta| + 1}} + O(N^{-\ell+1})$. 

Consider $\ell$ sequences of random variables $B_1^{(N)}, \dots, B_\ell^{(N)}$ such that the equality 
$M_{B,\Delta}^{(N)} = M_{A,\Delta}^{(N)}$ holds for $\Delta \subsetneq [\ell]$ and such that Equation (12) is fulfilled when 
A is replaced by B (the reader may wonder whether such a family B exists; let 
us temporarily ignore this problem, which will be addressed in Remark 3.3). By 
definition, the family B of sequences of random variables fulfills condition I of 
the lemma and, hence, using the first part of the proof, has also property II. In 
particular: 

$\kappa_{B,[\ell]}^{(N)} = O(N^{-\ell+1})$. 

But, by hypothesis, 

$\kappa_{A,[\ell]}^{(N)} = O(N^{-\ell+1})$. 

As A and B have the same joint moments, except for $M_{A,[\ell]}^{(N)}$ and $M_{B,[\ell]}^{(N)}$, this implies 
that 

$M_{A,[\ell]}^{(N)} - M_{B,[\ell]}^{(N)} = \kappa_{A,[\ell]}^{(N)} - \kappa_{B,[\ell]}^{(N)} = O(N^{-\ell+1})$. 

But the family B fulfills Equation (12) and, hence, so does family A. □ 

Remark 3.3. Let $\ell$ be a fixed integer and I a finite subset of $(\mathbb{N}_{>0})^\ell$. Then, for 
any list $(m_i)_{i \in I}$ of numbers, one can find some complex-valued random variables 
$X_1, \dots, X_\ell$ such that 

$E\big( X_1^{i_1} \cdots X_\ell^{i_\ell} \big) = m_{i_1, \dots, i_\ell}$. 

Indeed, one can look for a solution where $X_1$ is uniform on a finite set $\{z_1, \dots, z_T\}$ 
and $X_j = X_1^{d^{j-1}}$, where d is an integer bigger than all coordinates of all vectors in 
I. Then the quantities 

$\{ T \cdot E\big( X_1^{i_1} \cdots X_\ell^{i_\ell} \big), \ i \in I \}$ 

correspond to different power sums of $z_1, \dots, z_T$. Thus we have to find a family 
$\{z_1, \dots, z_T\}$ of complex numbers with specified power sums up to degree $d^\ell$. This 
exists as soon as $T \ge d^\ell$, because $\mathbb{C}$ is algebraically closed. In particular, the 
family B considered in the proof above exists. 

However, we do not really need this family to exist. Indeed, during the whole
proof, we only manipulate the sequences of moments and cumulants,
using only the relations between them (equations (7) and (8)). We never consider
the underlying random variables. Therefore, everything could be done even if the
random variables did not exist, as is often done in umbral calculus [29].
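The point of the remark, that only the relations between moments and cumulants are used, can be illustrated by a short sketch. It assumes the standard moment-cumulant formula (a sum over set partitions weighted by the Möbius function $(-1)^{\#\pi-1}(\#\pi-1)!$), which is presumably what equation (8) states; the function names and the two toy moment families are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from math import factorial


def set_partitions(items):
    """Yield every set partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def joint_cumulant(moment, indices):
    """Joint cumulant of (X_j, j in indices), computed from joint moments only.

    `moment(block)` must return E(prod_{j in block} X_j) for a frozenset
    `block`; the underlying random variables are never used.
    """
    total = Fraction(0)
    for partition in set_partitions(list(indices)):
        k = len(partition)
        # Moebius function of the set-partition lattice (assumed convention).
        term = Fraction((-1) ** (k - 1) * factorial(k - 1))
        for block in partition:
            term *= moment(frozenset(block))
        total += term
    return total


# Hypothetical "virtual" family whose moments factorize (independence).
means = {1: Fraction(1, 2), 2: Fraction(3), 3: Fraction(-2, 5)}


def independent_moment(block):
    result = Fraction(1)
    for j in block:
        result *= means[j]
    return result


# Three copies of one Bernoulli(p) variable: every joint moment equals p.
p = Fraction(1, 3)


def bernoulli_moment(block):
    return p


print(joint_cumulant(independent_moment, [1, 2, 3]))
print(joint_cumulant(bernoulli_moment, [1, 2, 3]))
```

Since `joint_cumulant` only receives a function returning moments, it applies equally to a family of numbers that is not known to come from actual random variables.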



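3.3. Case with distinct indices. We consider here the case where all entries in
the sequences $\mathbf{i}$ and $\mathbf{s}$ are distinct. To be in the situation of Lemma 3.2, we set, for
$h \in [\ell]$ and $N \ge 1$:

\[ A_h^{(N)} = (N+\theta-1)_{a_h} \prod_{j \in \tau_h} B_{i_j,s_j}^{(N)}, \]

where $a_h = |\tau_h|$. The normalization factor has been chosen such that $\mathbb{E}(A_h^{(N)}) = 1$.
Hence, we will be able to apply Lemma 3.2.

Let us prove that $A_1^{(N)}, \dots, A_\ell^{(N)}$ fulfills property I of this lemma. Of course,
the case $\Delta = [\ell]$ is generic. The joint moments of the family $\mathbf{A}$ have in this case
an explicit expression: for $\delta \subseteq [\ell]$,

\[ M_\delta^{(N)} = \frac{\prod_{j \in \delta} (N+\theta-1)_{a_j}}{(N+\theta-1)_{\sum_{j \in \delta} a_j}}. \]

Therefore, we have to prove that the quantity

\[ Q_{a_1,\dots,a_\ell} := \prod_{\delta \subseteq [\ell]} \big(M_\delta^{(N)}\big)^{(-1)^{|\delta|}} = \prod_{\delta \subseteq [\ell]} \Big( (N+\theta-1)_{\sum_{j \in \delta} a_j} \Big)^{(-1)^{|\delta|+1}} \]

can be written as $1 + O(N^{-\ell+1})$.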

We proceed by induction over $a_\ell$. If $a_\ell = 0$, for any $\delta \subseteq [\ell-1]$, the factors
corresponding to $\delta$ and $\delta \cup \{\ell\}$ cancel each other. Thus $Q_{a_1,\dots,a_{\ell-1},0} = 1$ and the
statement holds.

If $a_\ell > 0$, the quantity $Q_{a_1,\dots,a_\ell}$ can be written as

\[ Q_{a_1,\dots,a_{\ell-1},a_\ell} = Q_{a_1,\dots,a_{\ell-1},a_\ell-1} \cdot \prod_{\delta \subseteq [\ell-1]} \Big( N + \theta - 1 - \sum_{j \in \delta} a_j - a_\ell \Big)^{(-1)^{|\delta|}}. \]

Setting $X = N + \theta - 1 - a_\ell$, the second factor becomes

\[ R_{a_1,\dots,a_{\ell-1}}(X) := \prod_{\delta \subseteq [\ell-1]} \Big( X - \sum_{j \in \delta} a_j \Big)^{(-1)^{|\delta|}}. \]

We will prove below (Lemma 3.4) that $R_{a_1,\dots,a_{\ell-1}}(X) = 1 + O(X^{-\ell+1})$. Besides,
the induction hypothesis implies that $Q_{a_1,\dots,a_{\ell-1},a_\ell-1} = 1 + O(N^{-\ell+1})$ and hence

\[ Q_{a_1,\dots,a_\ell} = 1 + O(N^{-\ell+1}). \]

Finally, the family $A_1^{(N)}, \dots, A_\ell^{(N)}$ of sequences of random variables has the
quasi-factorisation property of Lemma 3.2. Thus it also has the small cumulant property
and in particular

\[ \kappa\big(A_1^{(N)}, \dots, A_\ell^{(N)}\big) = O(N^{-\ell+1}). \]

Using the definition of the $A_h^{(N)}$, this can be rewritten:

\[ \kappa\Big( \prod_{j \in \tau_1} B_{i_j,s_j}^{(N)}, \dots, \prod_{j \in \tau_\ell} B_{i_j,s_j}^{(N)} \Big) = O(N^{-r-\ell+1}), \]

which is Theorem 1.4 in the case of distinct indices. □
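The quasi-factorization estimate of this subsection can be checked with exact arithmetic. The sketch below assumes that the quantity to control is the alternating product of falling factorials $(N+\theta-1)_{\sum_{j\in\delta} a_j}$ over all subsets $\delta$, with the convention $(x)_n = x(x-1)\cdots(x-n+1)$; the function names and the sample values of $a$ and $\theta$ are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def falling(x, n):
    """Falling factorial (x)_n = x (x - 1) ... (x - n + 1)."""
    result = Fraction(1)
    for k in range(n):
        result *= x - k
    return result


def quasi_factorization(a, N, theta):
    """Product over subsets delta of the index set of
    ((N + theta - 1)_{sum_{j in delta} a_j}) ** ((-1) ** (|delta| + 1))."""
    x = Fraction(N) + Fraction(theta) - 1
    q = Fraction(1)
    for size in range(len(a) + 1):
        for delta in combinations(range(len(a)), size):
            factor = falling(x, sum(a[j] for j in delta))
            if size % 2 == 1:
                q *= factor   # odd subsets carry exponent +1
            else:
                q /= factor   # even subsets carry exponent -1
    return q


a = (1, 2, 1)            # hypothetical block sizes, so l = 3
theta = Fraction(3, 2)   # hypothetical Ewens parameter
for N in (10, 100, 1000):
    q = quasi_factorization(a, N, theta)
    # If Q = 1 + O(N^{-l+1}), this rescaled error should stay bounded.
    print(N, float((q - 1) * N ** (len(a) - 1)))
```

For $a = (1,1,1)$ the product simplifies by hand to $1 - 1/(x-1)^2$ with $x = N+\theta-1$, which matches the expected order $N^{-2}$.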

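Here is the technical lemma that we left behind in the proof:

Lemma 3.4. For any positive integers $a_1, \dots, a_{\ell-1}$,

\[ \prod_{\delta \subseteq [\ell-1]} \Big( X - \sum_{j \in \delta} a_j \Big)^{(-1)^{|\delta|}} = 1 + O(X^{-\ell+1}). \]

Proof. Define $R_{\mathrm{ev}}$ (resp. $R_{\mathrm{odd}}$) as

\[ \prod_{\delta} \Big( X - \sum_{j \in \delta} a_j \Big), \]

where the product runs over subsets of $[\ell-1]$ of even (resp. odd) size. Expanding
the product, one gets

\[ R_{\mathrm{ev}} = \sum_{m \ge 0} \ \sum_{\delta_1, \dots, \delta_m} \ \sum_{j_1 \in \delta_1, \dots, j_m \in \delta_m} (-1)^m a_{j_1} \cdots a_{j_m} X^{2^{\ell-2}-m}. \]

The index set of the second summation symbol is the set of lists of $m$ distinct (but
not necessarily disjoint) subsets of $[\ell-1]$ of even size. Of course, a similar formula
with subsets of odd size holds for $R_{\mathrm{odd}}$.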
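Let us fix an integer $m < \ell - 1$ and a list $j_1, \dots, j_m$. Denote by $j_0$ the smallest
integer in $[\ell-1]$ different from $j_1, \dots, j_m$ (as $m < \ell - 1$, such an integer necessarily
exists). Then one has a bijection:

\[ \Big\{ \text{lists of subsets } \delta_1, \dots, \delta_m \text{ of even size such that, for all } h \le m,\ j_h \in \delta_h \Big\} \longrightarrow \Big\{ \text{lists of subsets } \delta_1, \dots, \delta_m \text{ of odd size such that, for all } h \le m,\ j_h \in \delta_h \Big\}, \]

\[ (\delta_1, \dots, \delta_m) \mapsto (\delta_1 \nabla \{j_0\}, \dots, \delta_m \nabla \{j_0\}), \]

where $\nabla$ is the symmetric difference operator. This bijection implies that the
summand $(-1)^m a_{j_1} \cdots a_{j_m} X^{2^{\ell-2}-m}$ appears as many times in $R_{\mathrm{ev}}$ as in $R_{\mathrm{odd}}$.
Finally, all terms corresponding to values of $m$ smaller than $\ell - 1$ cancel in the
difference $R_{\mathrm{ev}} - R_{\mathrm{odd}}$ and one has

\[ R_{\mathrm{ev}} - R_{\mathrm{odd}} = O\big(X^{2^{\ell-2}-\ell+1}\big). \]

As $R_{\mathrm{odd}}$ is a monic polynomial of degree $2^{\ell-2}$ in $X$, dividing by $R_{\mathrm{odd}}$ gives the lemma. □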

Remark 3.5. Thanks to a result of Leonov and Shiryaev that expresses cumulants 
of products of random variables as product of cumulants (see ll25l or QUI Theorem 
4.4]), it would have been enough to prove our result for $a_1 = \dots = a_\ell = 1$. But,
as our proof uses an induction on $a_\ell$, we have not made this choice.

Remark 3.6. We would like to point out the fact that our result is closely related to 
a result of P. Sniady. Indeed, thanks to our multiplicative criterion to have small cu- 
mulants, the computation in this section is equivalent to Lemma 4.8 of paper QUI . 
However, Sniady's proof relies on a non trivial theory of cumulants of observables 
of Young diagrams. Therefore, it seems to us that it is worth giving an alternative 
argument. 

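3.4. General case.

Let $A_1^{(N)}, \dots, A_\ell^{(N)}$ be some sequences of random variables.
We introduce some truncated cumulants: if $\pi_0$, $\pi_1$, $\pi_2$ and so on, are set partitions
of $[\ell]$, we set

\[ k^{(N)}(\pi_0) = \sum_{\pi \ge \pi_0} \mu(\pi, \{[\ell]\}) \prod_{C \in \pi} M_C^{(N)}, \]

\[ k^{(N)}(\pi_0; \pi_1, \pi_2, \dots) = \sum_{\substack{\pi \ge \pi_0 \\ \pi \not\ge \pi_1, \pi_2, \dots}} \mu(\pi, \{[\ell]\}) \prod_{C \in \pi} M_C^{(N)}. \]

In the context of Lemma 3.2, it is also possible to bound the truncated cumulants.

Lemma 3.7. Let $A_1^{(N)}, \dots, A_\ell^{(N)}$ be some sequences of random variables as in
Lemma 3.2, fulfilling property I (or equivalently property II).

• If $\pi_0$ is a set partition of $[\ell]$,

\[ k^{(N)}(\pi_0) = O\big(N^{-\#(\pi_0)+1}\big). \]

• More generally, if $\pi_0, \pi_1, \pi_2, \dots$ are set partitions of $[\ell]$,

\[ k^{(N)}(\pi_0; \pi_1, \pi_2, \dots) = O\big(N^{-\#(\pi_0 \vee \pi_1 \vee \pi_2 \vee \cdots)+1}\big). \]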
Proof. For the first statement, the proof is similar to the one of I $\Rightarrow$ II of Lemma
3.2. One can write an analogue of equation (11):

\[ k^{(N)}(\pi_0) = \sum_{\substack{\Delta_1, \dots, \Delta_m \\ \text{pairwise distinct}}} T_{\Delta_1}^{(N)} \cdots T_{\Delta_m}^{(N)} \Bigg( \sum_{\pi \ge \pi_0 \vee \pi(\Delta_1) \vee \cdots} \mu(\pi, \{[\ell]\}) \Bigg). \]

The same argument as above says that only terms corresponding to lists such that
$\pi_0 \vee \pi(\Delta_1) \vee \cdots = \{[\ell]\}$ survive. Such lists fulfill

\[ \sum_{i=1}^{m} (|\Delta_i| - 1) \ge \mathrm{rk}(\{[\ell]\}) - \mathrm{rk}(\pi_0) = \#(\pi_0) - 1. \]

The first item of the Lemma follows because, by hypothesis,

\[ T_{\Delta_1}^{(N)} \cdots T_{\Delta_m}^{(N)} = O\big(N^{-\sum_{i=1}^{m} (|\Delta_i| - 1)}\big). \]

For the second statement, we use an inclusion/exclusion:

\[ k^{(N)}(\pi_0; \pi_1, \dots, \pi_h) = \sum_{I \subseteq [h]} (-1)^{|I|} \, k^{(N)}\Big( \pi_0 \vee \bigvee_{i \in I} \pi_i \Big). \]

Then the second item follows from the first. □

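Let us come back to the proof of Theorem 1.4. We fix two lists $\mathbf{i}$ and $\mathbf{s}$ of length
$r$, as well as a set partition $\tau$ of $[r]$. We want to find a bound for

\[ \kappa\Big( \prod_{j \in \tau_1} B_{i_j,s_j}^{(N)}, \dots, \prod_{j \in \tau_\ell} B_{i_j,s_j}^{(N)} \Big) = \sum_{\pi \ge \tau} \mu(\pi, \{[r]\}) \prod_{C \in \pi} \mathbb{E}\Big( \prod_{j \in C} B_{i_j,s_j}^{(N)} \Big). \]

We split the sum according to the values of the partitions $\pi_1 = \pi \wedge CC(G_1(\mathbf{i},\mathbf{s}))$
and $\pi_2 = \pi \wedge CC(G_2(\mathbf{i},\mathbf{s}))$. More precisely,

\[ \kappa\Big( \prod_{j \in \tau_1} B_{i_j,s_j}^{(N)}, \dots, \prod_{j \in \tau_\ell} B_{i_j,s_j}^{(N)} \Big) = \sum_{\substack{\pi_1 \le CC(G_1(\mathbf{i},\mathbf{s})) \\ \pi_2 \le CC(G_2(\mathbf{i},\mathbf{s}))}} Y_{\pi_1,\pi_2}^{(N)}, \]

where

\[ Y_{\pi_1,\pi_2}^{(N)} = \sum_{\substack{\pi \ge \tau \\ \pi \wedge CC(G_1(\mathbf{i},\mathbf{s})) = \pi_1 \\ \pi \wedge CC(G_2(\mathbf{i},\mathbf{s})) = \pi_2}} \mu(\pi, \{[r]\}) \prod_{C \in \pi} \mathbb{E}\Big( \prod_{j \in C} B_{i_j,s_j}^{(N)} \Big). \]

We call the summation index the slice determined by $\pi_1$ and $\pi_2$.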

Let us fix some partitions $\pi_1$ and $\pi_2$. For each block $C$ of $\pi_1$, we consider some
sequence of random variables $(A_C^{(N)})_{N \ge 1}$ such that: for each list of distinct blocks
$C_1, \dots, C_h$,

\[ \mathbb{E}\big( A_{C_1}^{(N)} \cdots A_{C_h}^{(N)} \big) = \frac{1}{(N+\theta-1)(N+\theta-2) \cdots (N+\theta-h)}. \]

For readers who wonder whether such variables exist, we refer to Remark 3.3,
which remains valid here. Consider the family

\[ \big( (N+\theta-1) A_C^{(N)} \big)_{C \in \pi_1}. \tag{13} \]

Its joint moments are the same as the ones of the $A_h^{(N)}$ in the previous section. It
has been proven that such a family has the quasi-factorization property and, hence,
its cumulants and truncated cumulants are small (Lemma 3.7).

But, if $\pi$ is in the slice determined by $\pi_1$ and $\pi_2$, one can check easily (see
the description of joint moments in section 3.1) that the corresponding product of
moments is given by:

\[ \prod_{C \in \pi} \mathbb{E}\Big( \prod_{j \in C} B_{i_j,s_j}^{(N)} \Big) = a_{\pi_1,\pi_2} \prod_{C \in \pi} \mathbb{E}\Big( \prod_{\substack{C' \in \pi_1 \\ C' \subseteq C}} A_{C'}^{(N)} \Big), \]

where $a_{\pi_1,\pi_2}$ depends only on $\pi_1$ and $\pi_2$ and is given by:

• $0$ if $\pi_2$ contains in the same block two indices $j$ and $h$ such that $i_j = i_h$ but
$s_j \ne s_h$, or $s_j = s_h$ but $i_j \ne i_h$;

• $\theta^\gamma$ otherwise, where $\gamma$ is the number of cycles of the partial permutation
$(\mathbf{i}, \mathbf{s})$ whose indices are all contained in the same block of $\pi_2$.

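As a consequence,

\[ Y_{\pi_1,\pi_2}^{(N)} = \frac{a_{\pi_1,\pi_2}}{(N+\theta-1)^{\#(\pi_1)}} \sum_{\substack{\pi \ge \tau \\ \pi \wedge CC(G_1(\mathbf{i},\mathbf{s})) = \pi_1 \\ \pi \wedge CC(G_2(\mathbf{i},\mathbf{s})) = \pi_2}} \mu(\pi, \{[r]\}) \prod_{C \in \pi} \mathbb{E}\Big( \prod_{\substack{C' \in \pi_1 \\ C' \subseteq C}} (N+\theta-1) A_{C'}^{(N)} \Big). \tag{14} \]

But the condition $\pi \wedge CC(G_1(\mathbf{i},\mathbf{s})) = \pi_1$ can be rewritten as follows: $\pi \ge \pi_1$
and $\pi \not\ge \pi'$ for any $\pi_1 < \pi' \le CC(G_1(\mathbf{i},\mathbf{s}))$. A similar rewriting can be
performed for the condition $\pi \wedge CC(G_2(\mathbf{i},\mathbf{s})) = \pi_2$. Finally, the sum in equation
(14) above is a truncated cumulant of the family (13) and is bounded from above
by $O\big(N^{-|CC(G_2(\mathbf{i},\mathbf{s})) \vee \tau|+1}\big)$. This implies

\[ Y_{\pi_1,\pi_2}^{(N)} = O\big(N^{-\#(\pi_1) - |CC(G_2(\mathbf{i},\mathbf{s})) \vee \tau| + 1}\big), \]

which ends the proof of Theorem 1.4 because $\pi_1$ has necessarily at least as many
parts as $CC(G_1(\mathbf{i},\mathbf{s}))$. □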

Remark 3.8. So far, we have considered the lists $\mathbf{i}$ and $\mathbf{s}$ as fixed. Therefore, the
constant hidden in the Landau symbol $O$ may depend on these lists. However, the
quantity for which we establish an upper bound depends only on the partition $\tau$ and
on which entries of the lists $\mathbf{i}$ and $\mathbf{s}$ coincide. For a fixed $r$, the number of partitions
and of possible equalities is finite. Therefore, we can choose a constant depending
only on $r$, as is done in the statement of Theorem 1.4.

Figure 1. An example of a graph, its contraction and surcontraction.



4. Graph-theoretical lemmas 

In this section, we present two quite easy lemmas on the number of connected
components of contractions of graphs. These lemmas will be useful in the next
sections for applications of Theorem 1.4.

4.1. Notations. Let us consider a graph $G$ with vertex set $V$ and edge set $E$. By
definition, if $V'$ is a subset of $V$, the graph $G[V']$ induced by $G$ on $V'$ has vertex set
$V'$ and edge set $E[V']$, where $E[V']$ is the subset of $E$ consisting of edges having
both their extremities in $V'$.

Let us consider a surjective map $f$ from $V$ to another set $W$. Then the contraction
of $G$ by $f$ is the graph $G/f$ with vertex set $W$ and which has an edge between
$w$ and $w'$ if, in $G$, there is at least one edge between a vertex of $f^{-1}(w)$ and a
vertex of $f^{-1}(w')$.

Example. Consider the graph $G$ of Figure 1. Its vertex set is the 10-element
set $V = \{1, 2, 3, 4, 5, \bar{1}, \bar{2}, \bar{3}, \bar{4}, \bar{5}\}$. Consider the application $f$ from $V$ to the set
$W = \{1, 2, 3, 4, 5\}$, consisting in forgetting the bar (if any). The contracted graph
$G/f$ is drawn on the bottom left picture of Figure 1.

4.2. Connected components of contractions. 

Lemma 4.1. Let $G$ be a graph with vertex set $V$ and $f$ a surjective map from $V$ to
another set $W$. Then

\[ \#(CC(G)) \le \#(CC(G/f)) + \sum_{w \in W} \big( \#(CC(G[f^{-1}(w)])) - 1 \big). \]

Proof. For each edge $(w, w')$ in $G/f$, we choose arbitrarily an edge $(v, v')$ in $G$
such that $f(v) = w$ and $f(v') = w'$ (by definition of $G/f$, such an edge exists but
is not necessarily unique). Thereby, to each edge of $G/f$ or of $G[f^{-1}(w)]$ (for any
$w$ in $W$) corresponds canonically an edge in $G$.

Take covering forests $F_{G/f}$ and $(F_w)_{w \in W}$ of the graphs $G/f$ and $G[f^{-1}(w)]$ for
$w \in W$. With the remark above, to each covering forest corresponds a set of edges
in $G$. Consider the union $F$ of these sets. It is an acyclic set of edges of $G$. Indeed,
if it contained a cycle, this cycle would have to be contained in one of the fibers $f^{-1}(w)$, otherwise
it would induce a cycle in $F_{G/f}$. But, in this case, all edges of the cycle belong to
$F_w$, which is impossible, since $F_w$ is a forest.
Finally, $F$ is an acyclic set of edges in $G$ and

\[ \#(CC(G)) \le |V| - |F| = |W| - |F_{G/f}| + \sum_{w \in W} \big( |f^{-1}(w)| - 1 - |F_w| \big) \]

\[ \le \#(CC(G/f)) + \sum_{w \in W} \big( \#(CC(G[f^{-1}(w)])) - 1 \big). \quad \square \]

Continuing the example. All fibers $f^{-1}(i)$ (for $i = 1, 2, 3, 4, 5$) are of size 2.
Three of them contain one edge (for $i = 3, 4, 5$) and hence are connected, while
the other two have two connected components. Finally, the sum in the lemma is
equal to 2, which is equal to the difference

\[ \#(CC(G)) - \#(CC(G/f)) = 4 - 2 = 2. \]
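The numbers in this example can be recomputed mechanically. The sketch below assumes that the graph of Figure 1 is the graph attached to the lists $\mathbf{i} = (5,2,2,7,7)$ and $\mathbf{s} = (8,8,2,7,7)$ used in the illustration of Section 6.3 (two vertices are joined when they carry the same value); the encoding of barred vertices as pairs `(j, True)` and all function names are hypothetical.

```python
from itertools import combinations


def components(vertices, edges):
    """Number of connected components, via a small union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})


def contraction(edges, f):
    """Edges of G/f: images of the edges of G joining two different fibers."""
    return {frozenset((f(u), f(v))) for u, v in edges if f(u) != f(v)}


# Assumed encoding of the example: vertex (j, False) is j, (j, True) is j-bar.
i = (5, 2, 2, 7, 7)
s = (8, 8, 2, 7, 7)
label = {(j, False): i[j - 1] for j in range(1, 6)}
label.update({(j, True): s[j - 1] for j in range(1, 6)})
V = list(label)
E = [(u, v) for u, v in combinations(V, 2) if label[u] == label[v]]


def f(v):
    """Forget the bar."""
    return v[0]


W = sorted({f(v) for v in V})
cc_G = components(V, E)
cc_contracted = components(W, contraction(E, f))
# Sum over the fibers of (number of components of the induced graph - 1).
fiber_sum = sum(
    components([v for v in V if f(v) == w],
               [(u, v) for u, v in E if f(u) == w and f(v) == w]) - 1
    for w in W
)
print(cc_G, cc_contracted, fiber_sum)
```

With this encoding the inequality of Lemma 4.1 is an equality, as stated in the example.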

4.3. Fibers of size 2. In this section, we further assume that $V = W \sqcup W$ and
that $f$ is the canonical application $W \sqcup W \to W$ consisting in forgetting to which
copy of $W$ the element belongs. Throughout the paper, for simplicity of notations,
we will use overlined letters for elements of the second copy of $W$.

In this context, in addition to the contraction $G/f$, one can consider another
graph with vertex set $W$. By definition, $G//f$ has an edge between $w$ and $w'$ if, in
$G$, there is an edge between $w$ and $w'$ and an edge between $\bar{w}$ and $\bar{w}'$. We call this
graph the surcontraction of $G$.

Continuing the example. The graph $G$ and the function $f$ in the example above
fit in the context described in this section. The surcontraction $G//f$ is drawn on
Figure 1 (bottom right picture).

Lemma 4.2. Let $G$ and $f$ be as above. Then

\[ \#(CC(G)) \le \#(CC(G/f)) + \#(CC(G//f)). \]

Proof. Set $G_1 = G//f$, $G_2 = G/f$ and $G_3 = G$.

By definition, an edge in $G_1$ between $j$ and $k$ corresponds to two edges in $G_3$.
In contrast, an edge in $G_2$ corresponds to at least one edge in $G_3$.

Consider a spanning forest $F_1$ in $G_1$. As the set of edges of $G_1$ is contained in the
one of $G_2$, $F_1$ can be completed into a spanning forest $F_2$ of $G_2$. We consider the
subset $F_3$ of edges of $G_3$ obtained as follows: for each edge of $F_1$, we take the two
corresponding edges in $G_3$ and for each edge of $F_2 \setminus F_1$, we take the corresponding
edge in $G_3$ (if there are several corresponding edges, choose one arbitrarily).

We will prove by contradiction that $F_3$ is acyclic. Suppose that $F_3$ contains a
cycle $C_3$. Each edge of $C_3$ projects on an edge in $F_2$ and thus the projection of $C_3$
is a list $S = (e_1, \dots, e_h)$ of consecutive edges in $F_2$ (consecutive means that we
can orient the edges such that, for each $i \in [h]$, the end point of $e_i$ is the starting
point of $e_{i+1}$, with the convention $e_{h+1} = e_1$). This list is not necessarily a cycle
because it can contain the same edge twice (either in the same direction or in
different directions). Indeed, $F_3$ contains some pairs of edges of the form

\[ (\{w, w'\}, \{\bar{w}, \bar{w}'\}) \]

which project on the same edge in $G_2$. But as edges from these pairs have no
extremities in common, they cannot appear consecutively in the cycle $C_3$. Therefore,
the same edge cannot appear twice in a row in the list $S$. This implies that the list
$S$ contains a cycle $C_2$ as a factor. We have reached a contradiction as the edges in
$C_2$ are edges of the forest $F_2$. Thus $F_3$ is acyclic.

The number of edges in $F_3$ is clearly $2|F_1| + |F_2 \setminus F_1| = |F_1| + |F_2|$. Therefore

\[ \#(CC(G_3)) \le 2|W| - |F_3| = (|W| - |F_1|) + (|W| - |F_2|) = \#(CC(G_1)) + \#(CC(G_2)). \quad \square \]
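As a sanity check of Lemma 4.2 (not a substitute for the proof), one can test the inequality on random graphs over two copies of a small set. This is a minimal sketch: the random model, the seed and the function names are arbitrary choices made for illustration.

```python
import random
from itertools import combinations


def components(vertices, edges):
    """Number of connected components, via a small union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})


def lemma_4_2_sides(n, p, rng):
    """Return (#CC(G), #CC(G/f), #CC(G//f)) for a random graph G on
    two copies of range(n); vertex (w, 1) plays the role of w-bar."""
    V = [(w, copy) for w in range(n) for copy in (0, 1)]
    E = {e for e in combinations(V, 2) if rng.random() < p}
    W = list(range(n))
    # Contraction: at least one edge of G between the two fibers.
    contraction = {(u[0], v[0]) for u, v in E if u[0] != v[0]}
    # Surcontraction: an edge w - x together with an edge w-bar - x-bar.
    surcontraction = {
        (w, x) for w, x in combinations(W, 2)
        if ((w, 0), (x, 0)) in E and ((w, 1), (x, 1)) in E
    }
    return (components(V, E),
            components(W, contraction),
            components(W, surcontraction))


rng = random.Random(0)
violations = 0
for _ in range(1000):
    cc_g, cc_contr, cc_sur = lemma_4_2_sides(rng.randint(1, 6), rng.random(), rng)
    if cc_g > cc_contr + cc_sur:
        violations += 1
print("violations of Lemma 4.2:", violations)
```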

5. Toy example: number of cycles of a given length p 

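In this section, we are interested in the number $\Gamma_p^{(N)}$ of cycles of length $p$ in a
random Ewens permutation of size $N$. The asymptotic behavior of $\Gamma_p^{(N)}$ is easy to
determine (see Theorem 1.1), as its generating series is explicit and quite simple.
We will give another proof which relies on Theorem 1.4 and does not use an explicit
expression for the generating series of $\Gamma_p^{(N)}$.

The main steps of the proof are the same in the other examples, so let us
emphasize them here.

Step 1: expand the cumulants of the considered statistic.

In this step, one has to express the statistic we are interested in in terms of the
variables $B_{i,s}^{(N)}$: here,

\[ \Gamma_p^{(N)} = \sum_{1 \le i_1 < i_2, i_3, \dots, i_p \le N} B^{c,N}_{(i_1,\dots,i_p)}, \]

where $B^{c,N}_{(i_1,\dots,i_p)} = B_{i_1,i_2}^{(N)} \cdots B_{i_{p-1},i_p}^{(N)} \cdot B_{i_p,i_1}^{(N)}$ is defined by equation ©. Therefore,
one has

\[ \kappa_\ell\big(\Gamma_p^{(N)}\big) = \sum_{\substack{i^1_1 < i^1_2, i^1_3, \dots, i^1_p \\ \vdots \\ i^\ell_1 < i^\ell_2, i^\ell_3, \dots, i^\ell_p}} \kappa\Big( B_{i^1_1,i^1_2}^{(N)} \cdots B_{i^1_p,i^1_1}^{(N)}, \dots, B_{i^\ell_1,i^\ell_2}^{(N)} \cdots B_{i^\ell_p,i^\ell_1}^{(N)} \Big). \tag{15} \]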

Step 2: Give an upper bound for the elementary cumulants. 

Now, we would like to apply our main lemma to every summand of equation
(15). For this, one has to understand what is the exponent of $N$ in the upper bound
given by Theorem 1.4.

For a matrix

\[ \mathbf{i} = (i^r_j)_{\substack{1 \le j \le p \\ 1 \le r \le \ell}}, \]

we denote:

• $M(\mathbf{i}) = |\{(i^r_j, i^r_{j+1});\ 1 \le j \le p,\ 1 \le r \le \ell\}|$ the number of different
entries in the matrix of couples $(i^r_j, i^r_{j+1})$ (by convention, $i^r_{p+1} = i^r_1$);

• $Q(\mathbf{i})$ the number of connected components of the graph $G(\mathbf{i})$ on $[\ell]$ where
$r_1$ is linked with $r_2$ if

\[ \{i^{r_1}_j;\ 1 \le j \le p\} \cap \{i^{r_2}_j;\ 1 \le j \le p\} \ne \emptyset; \]

• $t(\mathbf{i})$ the number of distinct entries.

Clearly, $M(\mathbf{i})$ is always at least equal to $t(\mathbf{i})$. In the case where $\tau$ has $\ell$ blocks of
size $p$ and where the list $\mathbf{s}$ is obtained by a cyclic rotation of the list $\mathbf{i}$ in each block,
Theorem 1.4 writes as:

\[ \Big| \kappa\Big( B_{i^1_1,i^1_2}^{(N)} \cdots B_{i^1_p,i^1_1}^{(N)}, \dots, B_{i^\ell_1,i^\ell_2}^{(N)} \cdots B_{i^\ell_p,i^\ell_1}^{(N)} \Big) \Big| \le C_{p\ell} N^{-M(\mathbf{i}) - Q(\mathbf{i}) + 1} \le C_{p\ell} N^{-M(\mathbf{i})} \le C_{p\ell} N^{-t(\mathbf{i})}. \tag{16} \]
Step 3: give an upper bound for the number of lists. 

As the number of summands in Equation (15) depends on $N$, we cannot use
inequality (16) directly. We need a bound on the number of matrices $\mathbf{i}$ with a given
value of $M(\mathbf{i})$.

This bound comes from the following simple lemma:

Lemma 5.1. For each $L \ge 1$, there exists a constant $C'_L$ with the following
property. For any $N \ge 1$ and $t \in [L]$, the number of lists $\mathbf{i}$ of length $L$ with entries in
$[N]$ such that

\[ |\{i_1, \dots, i_L\}| = t \]

is bounded from above by $C'_L N^t$.

Proof. If we specify which indices correspond to entries with the same values (that
is a set partition of the set of indices), the number of corresponding lists is $(N)_t$ and
hence is bounded from above by $N^t$. This implies the lemma, with $C'_L$ being equal
to the number of set partitions of $[L]$. □
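The counting argument of Lemma 5.1 can be checked by brute force for small parameters. In this sketch, the exact count is written as (number of set partitions of the indices into $t$ blocks) times the falling factorial $(N)_t$, and compared with the bound where the constant is the number of all set partitions of $[L]$; the helper names are hypothetical.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of set partitions of an n-element set into k blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def falling(n, t):
    """Falling factorial (n)_t = n (n - 1) ... (n - t + 1)."""
    result = 1
    for k in range(t):
        result *= n - k
    return result


def bell(L):
    """Number of set partitions of [L], the constant C'_L of the lemma."""
    return sum(stirling2(L, k) for k in range(L + 1))


def count_lists(L, N, t):
    """Brute-force number of lists of length L over [N] with t distinct entries."""
    return sum(1 for lst in product(range(N), repeat=L) if len(set(lst)) == t)


L, N = 4, 5
for t in range(1, L + 1):
    exact = count_lists(L, N, t)
    # Each set partition with t blocks contributes (N)_t lists.
    print(t, exact, stirling2(L, t) * falling(N, t), bell(L) * N ** t)
```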

Step 4: conclude. 

By inequality (16) and Lemma 5.1, for each $t \in [p \cdot \ell]$, the contribution of lists
$(i^r_j)$ taking exactly $t$ different values is bounded from above by $C'_{p\ell} C_{p\ell}$ and hence

\[ \text{for all } \ell \ge 1, \quad \kappa_\ell\big(\Gamma_p^{(N)}\big) = O(1). \]

To compute the component of order 1, let us make the following remark: by the
argument above, the total contribution of lists $(i^r_j)$ with $M(\mathbf{i}) > t(\mathbf{i})$ or $Q(\mathbf{i}) > 1$
is $O(N^{-1})$.

But $M(\mathbf{i}) = t(\mathbf{i})$ implies that, as soon as

\[ \{i^{r_1}_j;\ 1 \le j \le p\} \cap \{i^{r_2}_j;\ 1 \le j \le p\} \ne \emptyset, \]

the cyclic words $(i^{r_1}_1, \dots, i^{r_1}_p)$ and $(i^{r_2}_1, \dots, i^{r_2}_p)$ are equal. As $i^r_1$ is always the
minimum of the $i^r_j$, the two words are in fact always equal in this case. In particular
$G(\mathbf{i})$ is a disjoint union of cliques. If we further assume $Q(\mathbf{i}) = 1$, i.e. $G(\mathbf{i})$ is
connected, $G(\mathbf{i})$ is the complete graph and we get that $i^r_j$ does not depend on $r$.
Finally

\[ \kappa_\ell\big(\Gamma_p^{(N)}\big) = \sum_{i_1 < i_2, i_3, \dots, i_p} \kappa_\ell\big( B_{i_1,i_2}^{(N)} \cdots B_{i_p,i_1}^{(N)} \big) + O(N^{-1}). \tag{17} \]

But each $B_{i_1,i_2}^{(N)} \cdots B_{i_p,i_1}^{(N)}$ is a Bernoulli variable of parameter $\theta/(N+\theta-1)_p$.
Therefore their moments are all equal to $\theta/(N+\theta-1)_p$ and by formula ©, their
cumulants are $\theta/(N+\theta-1)_p + O(N^{-2p})$. Finally, as there are $(N)_p/p$ terms in
equation (17),

\[ \kappa_\ell\big(\Gamma_p^{(N)}\big) = \frac{\theta}{p} + O(N^{-1}), \]

which implies that $\Gamma_p^{(N)}$ converges in distribution towards a Poisson law of parameter $\theta/p$.

Moreover, a simple adaptation of the proof of Equation (17) implies that

\[ \kappa\big(\Gamma_{p_1}^{(N)}, \dots, \Gamma_{p_\ell}^{(N)}\big) = O(N^{-1}) \]

as soon as two of the $p_r$'s are different. Indeed, no matrices $(i^r_j)_{1 \le r \le \ell,\ 1 \le j \le p_r}$ with rows
of different sizes fulfill simultaneously $M(\mathbf{i}) = t(\mathbf{i})$ and $Q(\mathbf{i}) = 1$. Finally, for
any $p \ge 1$, the vector $(\Gamma_1^{(N)}, \dots, \Gamma_p^{(N)})$ tends in distribution towards a vector
$(P_1, \dots, P_p)$ where the $P_i$ are independent Poisson-distributed random variables
with respective parameters $\theta/i$. □

Remark. After equation (17), one could have finished the proof without computation
by the following argument: $\Gamma_p^{(N)}$ has asymptotically the same cumulants as
a virtual variable $X_N$, which writes as a sum of independent random variables with
the same distribution as the $B^{c,N}_{(i_1,\dots,i_p)}$. As each $B^{c,N}_{(i_1,\dots,i_p)}$ is a Bernoulli variable of
expectation $\theta/(N+\theta-1)_p$ and as there are $(N)_p/p$ such variables, $X_N$ converges
in distribution towards a Poisson law of parameter $\theta/p$. And so does $\Gamma_p^{(N)}$.

As promised in the introduction, this argument follows the idea that everything
happens as if the variables $B^{c,N}_{(i_1,\dots,i_p)}$ were independent.
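The first-order computation of this proof can be checked exactly for small $N$. The sketch below assumes the Ewens weight $\theta^{\#\text{cycles}}$ and the reading of the proof in which there are $(N)_p/p$ Bernoulli variables of parameter $\theta/(N+\theta-1)_p$, so that the expected number of $p$-cycles is $\frac{\theta}{p}\,(N)_p/(N+\theta-1)_p$; the function names and the chosen values of $N$ and $\theta$ are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def falling(x, n):
    """Falling factorial (x)_n = x (x - 1) ... (x - n + 1)."""
    result = Fraction(1)
    for k in range(n):
        result *= x - k
    return result


def cycle_lengths(perm):
    """Cycle lengths of a permutation of range(N) given in one-line notation."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = perm[x]
            length += 1
        lengths.append(length)
    return lengths


def ewens_expected_cycles(N, p, theta):
    """Exact expectation of the number of p-cycles, by enumerating S_N."""
    theta = Fraction(theta)
    weight_sum, total = Fraction(0), Fraction(0)
    for perm in permutations(range(N)):
        lengths = cycle_lengths(perm)
        weight = theta ** len(lengths)       # Ewens weight theta^{#cycles}
        weight_sum += weight
        total += weight * lengths.count(p)
    return total / weight_sum


def predicted(N, p, theta):
    """(N)_p / p Bernoulli variables, each of parameter theta / (N+theta-1)_p."""
    theta = Fraction(theta)
    return theta * falling(Fraction(N), p) / (p * falling(N + theta - 1, p))


theta = Fraction(3, 2)
for p in (1, 2, 3):
    print(p, ewens_expected_cycles(5, p, theta), predicted(5, p, theta))
# For large N the prediction approaches the Poisson parameter theta / p.
print(float(predicted(10 ** 6, 3, theta)), float(theta / 3))
```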

6. Number of excedances 

In this section, we look at our second motivating problem, the number of
excedances in random permutations. The first two subsections make a link between
a statistical physics model and this problem, justifying our work. The last two
subsections are devoted to the proof of Theorem 1.2 and related results.

6.1. Symmetric simple exclusion process. The symmetric simple exclusion pro- 
cess (SSEP for short) is a model of statistical physics: we consider particles on a 
discrete line with N sites. No two particles can be in the same site at the same 
moment. The system evolves as follows: 

• if its neighboring site is empty, a particle can jump to its left or its right 
with probability ; 

• if the left-most site is empty (resp. occupied), a particle can enter (resp. 
leave) by the left with probability (resp. jf^); 






• if the right-most site is empty (resp. occupied), a particle can enter (resp. 
leave) by the right with probability jA^ (resp. jA^)', 

• otherwise (we suppose a, j3, 7, 6 < 1 such that, in a given state, the sum of 
the probabilities of the events which may occur is smaller than 1), nothing 
happens. 

Mathematically, this defines an irreducible aperiodic Markov chain on the finite set
$\{0;1\}^N$ (a state of the SSEP can be encoded as a word in 0 and 1 of length $N$,
where the entries with value 1 correspond to the positions of the occupied sites).

This model is quite popular among physicists because, despite its simplicity,
it exhibits interesting phenomena like the existence of different phases. For a
comprehensive introduction on the subject and a survey of results, see [14].

A good way to describe a state $\tau$ of the SSEP is the function $F_\tau^{(N)}$ defined as
follows: when $Nx$ is an integer,

\[ F_\tau^{(N)}(x) = \frac{1}{N} \sum_{i=1}^{Nx} \tau_i \]

and, for each $i \in [N]$, the function is affine between $(i-1)/N$ and $i/N$. One
should see $F_\tau^{(N)}$ as the integral of the density of particles in the system.

We are interested in the steady state of the SSEP, that is the unique probability
measure $\mu_N$ on $\{0;1\}^N$, which is invariant by the dynamics. More precisely, we
want to study asymptotically the properties of the random function $F_\tau^{(N)}$, where $\tau$
is distributed with $\mu_N$ and $N$ tends to infinity.



6.2. Link with permutation tableaux and Ewens measure. From now on, we 
restrict to the case a, 7, 5 = 1. In this case, thanks to a result of S. Corteel and 
L. Williams lfl3l . the measure is related to some combinatorial objects, called 
permutation tableaux. 

The latter are fillings of Young diagrams (which can have empty rows, but no
empty columns) with 0 and 1 respecting some rules, the details of which will not be
important here. The Young diagram is called the shape of the permutation tableau.
The size of a permutation tableau is its number of rows, plus its number of columns
(and not the number of boxes!).

In addition to their link with statistical physics, permutation tableaux also
appear in algebraic geometry: they index the cells of some canonical decomposition
of the totally positive part of the Grassmannian [28, 33]. They have also been
widely studied from a purely combinatorial point of view [31, 12, 2].

To a permutation tableau $T$ of size $N + 1$, one can associate a word $w^T$ in
$\{0;1\}^N$ as follows: we label the steps of the border of the tableau starting from the
North-East corner to the South-West corner. The first step is always a South step.
For the other steps, we set $w^T_i = 1$ if and only if the $(i+1)$-th step is a South step.
Clearly, the word $w^T$ depends only on the shape of the tableau $T$. This procedure
is illustrated on Figure 2.
Figure 2. From permutation tableaux to words in $\{0;1\}^N$.

With this definition, the border of a tableau $T$ of size $N + 1$ is the parametric
broken line

\[ \Big\{ \big( n_1(w^T) - N F_{w^T}^{(N)}(x),\ -N\big(x - F_{w^T}^{(N)}(x)\big) - 1 \big) : x \in [0;1] \Big\}, \]

where $n_1(w^T)$ is the number of 1 in $w^T$ and $F_{w^T}^{(N)}$ the function associated to the
word $w^T$ as defined in the previous section. Hence, $F_{w^T}^{(N)}$ is a good way to encode
the shape of the permutation tableau $T$.

S. Corteel and L. Williams also introduced a statistic on permutation tableaux
called number of unrestricted rows and denoted $u(T)$. If $\beta$ is a positive real
parameter, this statistic induces a measure $\mu_N^T(\beta)$ on permutation tableaux of size $N$, for
which the probability to pick a tableau $T$ is proportional to $\beta^{-u(T)}$. This measure
is related to the SSEP by the following result (which is in fact a particular case
of H21 Theorem 3.1] but we do not know how to deal with the extra parameters
there).

Theorem 6.1. [13] The steady state of the SSEP $\mu_N$ is the push-forward by the
application $T \mapsto w^T$ of the probability measure $\mu_{N+1}^T(\beta)$.

It turns out that this measure can also be described using random permutations.
Indeed, S. Corteel and P. Nadeau [12, Theorem 1 and Section 3] have exhibited
a simple bijection $\Phi$ between permutations of $[N + 1]$ and permutation tableaux
of size $N + 1$, which satisfies:

• If a permutation $\sigma$ is mapped to a tableau $T = \Phi(\sigma)$, then:

\[ w^T = (\delta_2(\sigma), \delta_3(\sigma), \dots, \delta_{N+1}(\sigma)), \]

where $\delta_i = 1$ if $i$ is an ascent, that is if $\sigma(i) < \sigma(i+1)$ (by convention
$\delta_{\sigma(N+1)}(\sigma) = 1$).

• The number of unrestricted rows of a tableau $T = \Phi(\sigma)$ is the number of
right-to-left minima of $\sigma$: recall that $i$ is a right-to-left minimum of $\sigma$ if
$\sigma_j > i$ for any $j > \sigma^{-1}(i)$.

We are interested in the number of cycles of permutations rather than in their
number of right-to-left minima. The following bijection, which is a variant of the
first fundamental transformation on permutations [26, § 10.2], sends one of these
statistics to the other. Take a permutation $\sigma$, written in its cycle notation such that:

• its cycles end with their minimum;

• the minima of the cycles are in increasing order.

For example, $\sigma = (3\ 5\ 1)(7\ 4\ 2)(6)$. Now, erase the parentheses: we obtain the
word notation of a permutation $\Psi(\sigma)$.

The application $\Psi$ is a bijection from $S_N$ to $S_N$. Besides, the minima of the
cycles of $\sigma$ are the right-to-left minima of $\Psi(\sigma)$, while the ascents in $\Psi(\sigma)$ are the
weak excedances in $\sigma$, that is the integers $i$ such that $\sigma(i) \ge i$ (a similar statement
is given in [26, Theorem 10.2.3]).
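The bijection $\Psi$ and its two properties can be checked by brute force on small symmetric groups. This is a minimal sketch under one stated assumption: an ascent is recorded by its letter (the letter is followed by a larger one), and the last letter of the word is counted as an ascent by convention. Permutations are encoded as dictionaries $i \mapsto \sigma(i)$, and all function names are hypothetical.

```python
from itertools import permutations


def psi(sigma):
    """Word obtained by writing sigma (dict i -> sigma(i)) in cycle notation,
    cycles ending with their minimum, minima increasing, parentheses erased."""
    seen, word = set(), []
    for m in sorted(sigma):
        if m in seen:
            continue
        # m is the minimum of its cycle: write sigma(m), sigma^2(m), ..., m.
        cycle, x = [], sigma[m]
        while x != m:
            cycle.append(x)
            x = sigma[x]
        cycle.append(m)
        seen.update(cycle)
        word.extend(cycle)
    return word


def cycle_minima(sigma):
    minima, seen = set(), set()
    for m in sorted(sigma):
        if m not in seen:
            minima.add(m)
            x = m
            while x not in seen:
                seen.add(x)
                x = sigma[x]
    return minima


def right_to_left_minima(word):
    minima, current = set(), float("inf")
    for x in reversed(word):
        if x < current:
            minima.add(x)
            current = x
    return minima


def ascent_letters(word):
    """Letters followed by a larger letter; the last letter counts (assumed convention)."""
    last = len(word) - 1
    return {word[k] for k in range(len(word)) if k == last or word[k] < word[k + 1]}


def weak_excedances(sigma):
    return {i for i in sigma if sigma[i] >= i}


# The example of the text: sigma = (3 5 1)(7 4 2)(6).
example = {3: 5, 5: 1, 1: 3, 7: 4, 4: 2, 2: 7, 6: 6}
print(psi(example))
```

Running the same checks over all of $S_5$ (see the assertions used to test this block) supports both statements of the paragraph above for that size.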

From now on, we assume $\beta \cdot \theta = 1$. The properties above imply that $\mu_N^T(\beta)$ is
the push-forward of the Ewens measure of parameter $\theta$ by the application $\Phi \circ \Psi$.
Combining this with Theorem 6.1, the steady state of the SSEP $\mu_N$ is the
push-forward of Ewens measure by the application $\sigma \mapsto w^{\Phi(\Psi(\sigma))}$. But this application
admits an easy direct description

\[ S_{N+1} \to \{0;1\}^N, \qquad \sigma \mapsto \big( 1_{\sigma(2) \ge 2}, 1_{\sigma(3) \ge 3}, \dots, 1_{\sigma(N+1) \ge N+1} \big). \]

Recall that, as explained above, we are interested in the random function $F_\tau^{(N)}$,
where $\tau$ is distributed according to the measure $\mu_N$. The results above imply that
this random function has the same distribution as $F_\sigma^{(N+1)}$, where $\sigma$ is a random
permutation of size $N + 1$ distributed with Ewens measure of parameter $\theta$ and $F_\sigma^{(N+1)}$
is the function defined in section 1.2.

6.3. Bounds for cumulants. Let us fix some real numbers $x_1, \dots, x_\ell$ in $[0;1]$.
In this section, we will give some bounds on the joint cumulants of the random
variables $(F_\sigma^{(N)}(x_1), \dots, F_\sigma^{(N)}(x_\ell))$.

Let us begin with the following bound (step 2 of the proof, according to the
division done in section 5).

Proposition 6.2. For any $\ell \ge 1$, any $N \ge 1$ and any lists $i_1, \dots, i_\ell$ and $s_1, \dots, s_\ell$
of integers in $[N]$,

\[ \big| \kappa\big( B_{i_1,s_1}^{(N)}, \dots, B_{i_\ell,s_\ell}^{(N)} \big) \big| \le C_\ell N^{-|\{i_1,\dots,i_\ell,s_1,\dots,s_\ell\}|+1}, \]

where $C_\ell$ is the constant defined by Theorem 1.4.

Proof. Using Theorem 1.4 for $\tau = \{\{1\}, \dots, \{\ell\}\}$, we only have to prove that

\[ -\#(CC(G_1(\mathbf{i},\mathbf{s}))) - \#(CC(G_2(\mathbf{i},\mathbf{s}))) \le -|\{i_1, \dots, i_\ell, s_1, \dots, s_\ell\}|. \]

The last quantity $|\{i_1, \dots, i_\ell, s_1, \dots, s_\ell\}|$ can be seen as the number of connected
components of the graph $G_3(\mathbf{i},\mathbf{s})$ defined as follows:

• its vertex set is $[\ell] \sqcup [\bar{\ell}] = \{1, \bar{1}, \dots, \ell, \bar{\ell}\}$;

• there is an edge between $j$ and $k$ (resp. $j$ and $\bar{k}$, $\bar{j}$ and $\bar{k}$) if and only if
$i_j = i_k$ (resp. $i_j = s_k$, $s_j = s_k$).

The inequality above is simply Lemma 4.2 applied to the graph $G_3(\mathbf{i},\mathbf{s})$ ($G_1(\mathbf{i},\mathbf{s})$
and $G_2(\mathbf{i},\mathbf{s})$ are respectively its surcontraction and contraction). □



We can now prove the following bound: 
Proposition 6.3. There exists a constant $C''_\ell$ such that, for any integer $N \ge 1$ and
real numbers $x_1, \dots, x_\ell$, one has

\[ \big| \kappa\big( F_\sigma^{(N)}(x_1), \dots, F_\sigma^{(N)}(x_\ell) \big) \big| \le C''_\ell N^{-\ell+1}. \]

Proof. To simplify the notations, we suppose that $Nx_1, \dots, Nx_\ell$ are integers, so
that

\[ (N-1) \cdot F_\sigma^{(N)}(x_j) = \sum_{i=2}^{N x_j} B_i^{ex,N}. \]

But the Bernoulli variable $B_i^{ex,N}$ can be written as $B_i^{ex,N} = \sum_{s \ge i} B_{i,s}^{(N)}$. Finally,
by multilinearity, one has (step 1):

\[ (N-1)^\ell \, \kappa\big( F_\sigma^{(N)}(x_1), \dots, F_\sigma^{(N)}(x_\ell) \big) = \sum_{\substack{2 \le i_1 \le N x_1 \\ \vdots \\ 2 \le i_\ell \le N x_\ell}} \ \sum_{\substack{s_1 \ge i_1 \\ \vdots \\ s_\ell \ge i_\ell}} \kappa\big( B_{i_1,s_1}^{(N)}, \dots, B_{i_\ell,s_\ell}^{(N)} \big). \tag{18} \]

We apply Lemma 5.1 to the list $i_1, \dots, i_\ell, s_1, \dots, s_\ell$ and get that the number of
couples of lists $(\mathbf{i}, \mathbf{s})$ such that $|\{i_1, \dots, i_\ell, s_1, \dots, s_\ell\}|$ is equal to a given number
$t$ is bounded from above by $C'_{2\ell} N^t$ (step 3).

Combining this with Proposition 6.2, we get that the total contribution of couples
of lists $(\mathbf{i}, \mathbf{s})$ with $|\{i_1, \dots, i_\ell, s_1, \dots, s_\ell\}| = t$ to the right-hand side of (18) is
smaller than $C'_{2\ell} C_\ell N$, which ends the proof of Proposition 6.3 (step 4). □

Illustration of the proof. Set $\ell = 5$ and consider the lists $\mathbf{i} = (5, 2, 2, 7, 7)$ and
$\mathbf{s} = (8, 8, 2, 7, 7)$. The graph $G_3(\mathbf{i},\mathbf{s})$ associated to this couple of sequences is the
graph $G$ drawn on Figure 1. It follows immediately that $G_1(\mathbf{i},\mathbf{s}) = G//f$ has 4
connected components while $G_2(\mathbf{i},\mathbf{s}) = G/f$ has 2. Therefore, by Theorem 1.4,

\[ \big| \kappa\big( B_{5,8}^{(N)}, B_{2,8}^{(N)}, B_{2,2}^{(N)}, B_{7,7}^{(N)}, B_{7,7}^{(N)} \big) \big| \le C_5 N^{-5}. \]

The same bound is valid for all sequences $\mathbf{i}$ and $\mathbf{s}$ such that $G_3(\mathbf{i},\mathbf{s}) = G$. There
are fewer than $N^4$ such sequences: to construct such a sequence, one has to choose
distinct values for the four connected components of $G$, such that they fulfill some
inequalities. Finally, their total contribution to (18) is smaller than $C_5 N^{-1}$.
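The numbers used in this illustration can be recomputed directly from the two lists. The sketch assumes that $G_1(\mathbf{i},\mathbf{s})$ joins $j$ and $k$ when $i_j = i_k$ and $s_j = s_k$ (surcontraction) and that $G_2(\mathbf{i},\mathbf{s})$ joins them when $\{i_j, s_j\}$ and $\{i_k, s_k\}$ intersect (contraction), as follows from the description of $G_3(\mathbf{i},\mathbf{s})$ in the proof of Proposition 6.2; the function names are hypothetical.

```python
from itertools import combinations


def components(vertices, edges):
    """Number of connected components, via a small union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})


def cumulant_exponent(i, s):
    """Return (#CC(G_1), #CC(G_2), exponent of N given by Theorem 1.4
    when tau is the partition into singletons)."""
    idx = range(len(i))
    # Surcontraction: both entries of the couple (i_j, s_j) coincide.
    g1 = [(j, k) for j, k in combinations(idx, 2)
          if i[j] == i[k] and s[j] == s[k]]
    # Contraction: the couples share at least one value.
    g2 = [(j, k) for j, k in combinations(idx, 2)
          if {i[j], s[j]} & {i[k], s[k]}]
    cc1, cc2 = components(idx, g1), components(idx, g2)
    return cc1, cc2, -cc1 - cc2 + 1


i = (5, 2, 2, 7, 7)
s = (8, 8, 2, 7, 7)
cc1, cc2, exponent = cumulant_exponent(i, s)
distinct_values = len(set(i) | set(s))   # number of components of G_3
print(cc1, cc2, exponent, distinct_values)
```

Here the exponent from Theorem 1.4 is $-5$, while Proposition 6.2 only uses the weaker exponent $-4+1 = -3$ coming from the number of distinct values.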

Comparison with a result of B. Derrida, J.L. Lebowitz and E.R. Speer. In [15,
Appendix A], a long range correlation phenomenon is proven for the SSEP.
Rewritten in terms of Ewens random permutations via the material of the previous
section, it asserts that, for $i_1 < \dots < i_\ell$,

\[ \kappa\big( B_{i_1}^{ex,N}, \dots, B_{i_\ell}^{ex,N} \big) = O(N^{-\ell+1}). \]

In fact, their result is more general because it corresponds to the SSEP with all
parameters. This bound on cumulants can be obtained easily using our
Proposition 6.2 and Lemma 5.1. A slight generalization of it (taking into account the
case where some $i$'s can be equal) implies directly Proposition 6.3. Therefore, our
method does not give new results on the SSEP. Nevertheless, it was natural
to try to understand the long range correlation phenomenon directly in terms of
random permutations and this is what our approach does.
6.4. Convergence results. In this section, we explain how one can deduce from
the bound on cumulants some results on the convergence of the random function
$F_\sigma^{(N)}$, in particular Theorem 1.2.

In addition to the bounds above, we need equivalents for the first and second
joint cumulants of the $F_\sigma^{(N)}(x)$. An easy computation gives:

\[ \mathbb{E}\big(B_i^{ex,N}\big) = \frac{N - i + \theta}{N + \theta - 1}, \qquad \mathrm{Var}\big(B_i^{ex,N}\big) = \frac{(N - i + \theta)(i - 1)}{(N + \theta - 1)^2}, \]

\[ \mathrm{Cov}\big(B_i^{ex,N}, B_j^{ex,N}\big) = -\frac{(N - j + \theta)(i - 1)}{(N + \theta - 1)^2 (N + \theta - 2)} \quad \text{for } i < j, \]

from which we get the limits:

\[ \lim_{N \to \infty} \mathbb{E}\big(F_\sigma^{(N)}(x)\big) = \int_0^x (1 - t)\, dt = \frac{1 - (1 - x)^2}{2}, \tag{19} \]

\[ \lim_{N \to \infty} N \, \mathrm{Cov}\big(F_\sigma^{(N)}(x), F_\sigma^{(N)}(y)\big) = \int_0^{\min(x,y)} t(1 - t)\, dt - \iint_{\substack{0 \le t \le x \\ 0 \le u \le y}} \min(t, u)\big(1 - \max(t, u)\big)\, dt\, du. \tag{20} \]

We call $K(x, y)$ the right-hand side of the second equation. We begin with a proof
of Theorem 1.2, which describes the asymptotic behavior of $F_\sigma^{(N)}(x)$, for fixed
value(s) of $x$.
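The "easy computation" of the first and second moments can be confirmed by exact enumeration for a small $N$. The sketch below assumes the readings $\mathbb{E}(B_i) = \frac{N-i+\theta}{N+\theta-1}$, $\mathrm{Var}(B_i) = \frac{(N-i+\theta)(i-1)}{(N+\theta-1)^2}$ and $\mathrm{Cov}(B_i,B_j) = -\frac{(N-j+\theta)(i-1)}{(N+\theta-1)^2(N+\theta-2)}$ for $i<j$, where $B_i$ is the indicator of $\sigma(i) \ge i$ under the Ewens measure; the function names and the test values are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def count_cycles(sigma):
    """Number of cycles of sigma, a tuple with sigma[0] unused (1-indexed)."""
    seen, cycles = set(), 0
    for start in range(1, len(sigma)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = sigma[x]
    return cycles


def excedance_moments(N, theta):
    """Exact E(B_i) and E(B_i B_j), B_i = 1[sigma(i) >= i], under Ewens(theta) on S_N."""
    theta = Fraction(theta)
    Z = Fraction(0)
    first = [Fraction(0)] * (N + 1)
    second = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    for perm in permutations(range(1, N + 1)):
        sigma = (0,) + perm
        weight = theta ** count_cycles(sigma)
        Z += weight
        exc = [i for i in range(1, N + 1) if sigma[i] >= i]
        for i in exc:
            first[i] += weight
            for j in exc:
                second[i][j] += weight
    first = [m / Z for m in first]
    second = [[m / Z for m in row] for row in second]
    return first, second


N, theta = 5, Fraction(3, 2)
D = N + theta - 1
first, second = excedance_moments(N, theta)
for i in range(1, N + 1):
    # Compare the enumerated mean with the assumed closed formula.
    print(i, first[i], (N - i + theta) / D)
```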

Proof. Consider the first statement. The convergence in probability of $F_\sigma^{(N)}(x)$
towards $1/2 \cdot (1 - (1 - x)^2)$ follows immediately from equations (19) and (20).
For the almost-sure convergence, we have to study the fourth centered moment.

From moment-cumulant formula (8) and using the fact that all cumulants but the
first are invariant by a shift of the variable,

\[ \mathbb{E}\Big( \big( F_\sigma^{(N)}(x) - \mathbb{E}(F_\sigma^{(N)}(x)) \big)^4 \Big) = \kappa_4\big(F_\sigma^{(N)}(x)\big) + 3 \Big( \kappa_2\big(F_\sigma^{(N)}(x)\big) \Big)^2. \]

By Proposition 6.3, this quantity is bounded from above by $O(N^{-2})$ and, in
particular,

\[ \sum_{N \ge 1} \mathbb{E}\Big( \big( F_\sigma^{(N)}(x) - \mathbb{E}(F_\sigma^{(N)}(x)) \big)^4 \Big) < \infty. \]

The end of the proof is classical. First, we exchange the summation and expectation
symbols (all quantities are nonnegative). As its expectation is finite, the random
variable

\[ \sum_{N \ge 1} \big( F_\sigma^{(N)}(x) - \mathbb{E}(F_\sigma^{(N)}(x)) \big)^4 \]

is almost surely finite and hence its general term $\big( (F_\sigma^{(N)}(x) - \mathbb{E}(F_\sigma^{(N)}(x)))^4 \big)_{N \ge 1}$
tends almost surely to 0.
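The identity between the fourth centered moment and the cumulants $\kappa_4 + 3\kappa_2^2$ used in this proof is a general fact, which can be checked with exact arithmetic on any finite distribution. In the sketch below the cumulants are computed from raw moments with the standard formulas; the toy distribution and the function names are hypothetical.

```python
from fractions import Fraction


def raw_moment(dist, k):
    """E(X^k) for a finite distribution given as {value: probability}."""
    return sum(prob * value ** k for value, prob in dist.items())


def cumulants_2_and_4(dist):
    """Second and fourth cumulants, expressed through raw moments."""
    m1, m2, m3, m4 = (raw_moment(dist, k) for k in (1, 2, 3, 4))
    kappa2 = m2 - m1 ** 2
    kappa4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    return kappa2, kappa4


def fourth_centered_moment(dist):
    mean = raw_moment(dist, 1)
    return sum(prob * (value - mean) ** 4 for value, prob in dist.items())


# Hypothetical toy distribution on {0, 1, 3}.
dist = {Fraction(0): Fraction(1, 2), Fraction(1): Fraction(1, 3), Fraction(3): Fraction(1, 6)}
kappa2, kappa4 = cumulants_2_and_4(dist)
print(fourth_centered_moment(dist), kappa4 + 3 * kappa2 ** 2)
```

With $\kappa_4 = O(N^{-3})$ and $\kappa_2 = O(N^{-1})$ from Proposition 6.3, this identity gives the $O(N^{-2})$ bound used above.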
Let us consider the second statement. Proposition 6.3 implies that, for any list
$j_1, \dots, j_\ell$ of integers in $[r]$, one has

\[ \kappa\big( Z_\sigma^{(N)}(x_{j_1}), \dots, Z_\sigma^{(N)}(x_{j_\ell}) \big) = O(N^{-\ell/2+1}). \]

In particular, for $\ell > 2$ the left-hand side tends to 0. As the variables $Z_\sigma^{(N)}(x_i)$
are centered, this implies that $(Z_\sigma^{(N)}(x_1), \dots, Z_\sigma^{(N)}(x_r))$ tends towards a
centered Gaussian vector. The covariance matrix is the limit of the covariance of the
$Z_\sigma^{(N)}(x_i)$, that is $(K(x_i, x_j))$. □

It is also possible to obtain some results concerning the sequence of random
functions $(F_\sigma^{(N)})_{N \ge 1}$. In the following statement, we consider convergence in the
functional space $(C([0;1]), \|\cdot\|_\infty)$, that is uniform convergence of continuous
functions.

Theorem 6.4. Almost surely, the function $F_\sigma^{(N)}$ converges towards the function
$x \mapsto \frac{1}{2}(1 - (1-x)^2)$.

Moreover, the sequence of random functions $(x \mapsto Z_\sigma^{(N)}(x))_{N \ge 1}$ converges in
distribution towards the Gaussian process $x \mapsto G(x)$, whose finite-dimensional laws
are Gaussian vectors with covariance matrices given by $(K(x_i, x_j))_{1 \le i,j \le r}$.

Proof. As, for any $N \ge 1$ and any $\sigma \in S_N$, the function $x \mapsto F_\sigma^{(N)}(x)$ is
non-decreasing, the first statement follows easily from the convergence at any fixed x.
The argument can be found for example in a paper of J.F. Marckert [27, first page],
but it is so short and simple that we copy it here. By monotonicity of $F_\sigma^{(N)}$ and $F$,
for any list $0 = x_0 < x_1 < \dots < x_k = 1$, one has

\[ \sup_{x \in [0;1]} |F_\sigma^{(N)}(x) - F(x)| \le \max_{0 \le j < k} \max\big( |F_\sigma^{(N)}(x_{j+1}) - F(x_j)|,\ |F_\sigma^{(N)}(x_j) - F(x_{j+1})| \big) \xrightarrow{a.s.} \max_{0 \le j < k} |F(x_j) - F(x_{j+1})|, \]

which may be chosen as small as wanted.

Consider the second statement. If the sequence of random functions $x \mapsto Z_\sigma^{(N)}(x)$
has a limit, its finite-dimensional laws are necessarily the limits of the ones of $Z_\sigma^{(N)}$,
that is, by Theorem 1.2, Gaussian vectors with covariance matrices given by
$(K(x_i, x_j))_{1 \le i,j \le r}$. As a probability measure on $C([0;1])$ is entirely determined
by its finite-dimensional laws [7, Example 1.2], one just has to prove that the
sequence $x \mapsto Z_\sigma^{(N)}(x)$ has indeed a limit. To do this, it is enough to prove that it is
tight [7, Section 5, Theorems 5.1 and 7.1], that is, for each $\varepsilon > 0$ there exists some
constant M such that:

for all $N > 0$, one has $\mathrm{Prob}\big( \|Z_\sigma^{(N)}\|_\infty > M \big) < \varepsilon$.

Once again, this follows from a careful analysis of the fourth moment.
Let $N \ge 1$ and $s \ne s'$ in $[0;1]$ such that $Ns$ and $Ns'$ are integers. Using
equation (8) and the fact that $Z_\sigma^{(N)}(s)$ and $Z_\sigma^{(N)}(s')$ are centered, one has:

\[ E\big( (Z_\sigma^{(N)}(s) - Z_\sigma^{(N)}(s'))^4 \big) = \kappa_4\big(Z_\sigma^{(N)}(s) - Z_\sigma^{(N)}(s')\big) + 3\,\kappa_2\big(Z_\sigma^{(N)}(s) - Z_\sigma^{(N)}(s')\big)^2 \]
\[ = N^2 \Big[ \kappa_4\big(F_\sigma^{(N)}(s) - F_\sigma^{(N)}(s')\big) + 3\,\kappa_2\big(F_\sigma^{(N)}(s) - F_\sigma^{(N)}(s')\big)^2 \Big]. \]

A simple adaptation of the proof of Proposition 6.3 shows that

\[ \kappa_\ell\big(F_\sigma^{(N)}(s) - F_\sigma^{(N)}(s')\big) \le C_\ell\, N^{-\ell+1}\, |s - s'|. \]

Indeed, in Lemma 5.1, if we ask that at least one entry of the list i is between $Ns$
and $Ns'$, then the number of lists is bounded from above by $C_\ell N^t |s - s'|$. Finally,

\[ E\big( (Z_\sigma^{(N)}(s) - Z_\sigma^{(N)}(s'))^4 \big) \le N^2 \big( C_4 N^{-3} |s - s'| + 3 C_2^2 N^{-2} |s - s'|^2 \big) \le (C_4 + 3 C_2^2)\, |s - s'|^2. \]

The last inequality has been deduced from $|s - s'| \ge N^{-1}$.

We can now apply Th. 10.2 of Billingsley's book [7] with $S_i = Z_\sigma^{(N)}(i/N)$ (for
$0 \le i \le N$), $\alpha = \beta = 1$ and $u_\ell = (C_4 + 3 C_2^2)^{1/2}/N$ (see equation (10.11) of the
same book). We get that there exists some constant K such that

\[ \mathrm{Prob}\Big( \max_{0 \le i \le N} |S_i| \ge M \Big) \le K M^{-4}, \]

which proves that the sequence $Z_\sigma^{(N)}$ is tight. □

7. Generalized patterns 

This section is devoted to the applications of our method to adjacencies (subsection
7.2) and dashed patterns (subsection 7.3). These two statistics belong in fact
to the same general framework and we discuss in subsection 7.4 the possibility of
unifying our results.

The proofs in this section are a little bit more technical than the ones before
and in particular we need a new lemma for step 3, given in subsection 7.1.

7.1. Preliminaries. Let $L \ge 1$ be an integer. For each pair $\{j, k\} \subset [L]$, we
choose a finite set of integers $D_{\{j,k\}}$.

Consider a list $i_1, \dots, i_L$ of integers. For each pair $e = \{j, k\} \subset [L]$ (with
$j < k$), we denote by $\delta_e(i)$ the difference $i_k - i_j$. Then we associate to i a graph with
vertex set $[L]$ and edge set $\{e : \delta_e(i) \in D_e\}$.

The following lemma is a slight generalization of Lemma 5.1.

Lemma 7.1. For each L and family of sets $(D_{\{j,k\}})_{1 \le j < k \le L}$, there exists a
constant $C'_{L,D}$ with the following property. For any $N \ge 1$ and $t \le L$, the number of
sequences $i_1, \dots, i_L$ with entries in $[N]$, whose corresponding graph has exactly t
connected components, is bounded from above by $C'_{L,D} N^t$.
Proof. If we fix a graph G with vertex set $[L]$ and t connected components and
if we also fix, for each edge e of the graph, the actual value of $\delta_e(i)$, then the
corresponding number of lists i is smaller than $N^t$. Indeed, the sequence is
determined by the choice of one value per connected component of G (with some
constraints, so that no extra edges appear). But the number of graphs and of
values on edges are finite (the sets $D_{\{j,k\}}$ are finite) and depend only on L and on the
family D. □
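The counting statement of Lemma 7.1 can be explored by brute force for small L and N. The sketch below is an illustration under the conventions of this subsection (an edge $\{j,k\}$ is present when $i_k - i_j \in D_{\{j,k\}}$); positions are 0-based in the code and the dictionary `diff_sets` is a hypothetical encoding of the family D.

```python
from itertools import product


def component_count(values, diff_sets):
    """Connected components of the graph on {0..L-1} attached to the list `values`.

    There is an edge {j, k} (j < k) iff values[k] - values[j] lies in diff_sets[(j, k)].
    """
    L = len(values)
    parent = list(range(L))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for j in range(L):
        for k in range(j + 1, L):
            if values[k] - values[j] in diff_sets.get((j, k), ()):
                parent[find(j)] = find(k)
    return len({find(a) for a in range(L)})


def count_by_components(N, L, diff_sets):
    """counts[t] = number of lists in [N]^L whose graph has exactly t components."""
    counts = [0] * (L + 1)
    for values in product(range(1, N + 1), repeat=L):
        counts[component_count(values, diff_sets)] += 1
    return counts
```

For example, with L = 2 and $D_{\{1,2\}} = \{-1, 0, 1\}$, the lists with a single component are those with $|i_2 - i_1| \le 1$; there are $3N - 2$ of them, which is indeed of order $N^1$.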

7.2. Adjacencies. In this section, we prove the following extension of Theorem 

Theorem 7.2. Let $\sigma_N$ be a sequence of random permutations, such that $\sigma_N$ has
size N and is distributed with respect to Ewens measure of parameter $\theta$. Then the
number $A^{(N)}$ of adjacencies in $\sigma_N$ converges in distribution towards a Poisson
variable of parameter 2.
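Theorem 7.2 can be illustrated by simulation. The sketch below is not the method of the paper: it samples Ewens permutations with the Chinese restaurant construction (element k opens a new cycle with probability $\theta/(\theta+k-1)$, otherwise it is inserted after a uniformly chosen earlier element), and it assumes that an adjacency is a position i with $|\sigma(i+1) - \sigma(i)| = 1$. Function names are illustrative.

```python
import random


def sample_ewens(n, theta, rng):
    """Sample a permutation of {0..n-1} (list of images) from the Ewens measure.

    Chinese restaurant construction: when element k arrives, k elements are
    already placed; k starts a new cycle with probability theta / (theta + k).
    """
    sigma = [0] * n
    for k in range(n):
        if rng.random() * (theta + k) < theta:
            sigma[k] = k                 # new cycle (fixed point for now)
        else:
            j = rng.randrange(k)         # insert k right after j in j's cycle
            sigma[k] = sigma[j]
            sigma[j] = k
    return sigma


def count_adjacencies(sigma):
    """Number of positions i with |sigma(i+1) - sigma(i)| = 1."""
    return sum(1 for i in range(len(sigma) - 1) if abs(sigma[i + 1] - sigma[i]) == 1)


def mean_adjacencies(n, theta, samples, seed=0):
    """Empirical mean of the number of adjacencies over independent samples."""
    rng = random.Random(seed)
    return sum(count_adjacencies(sample_ewens(n, theta, rng)) for _ in range(samples)) / samples
```

For moderately large n, the empirical mean returned by `mean_adjacencies` should be close to 2, the mean of the limiting Poisson law; comparing the empirical variance with 2 gives a second quick check.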



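Proof. As before, we write $A^{(N)}$ in terms of the $B^{(N)}_{i,s}$ (we use the convention
$B^{(N)}_{i,s} = 0$ if i or s is not in $[N]$):

\[ A^{(N)} = \sum_{1 \le i,s \le N,\ \varepsilon = \pm 1} B^{(N)}_{i,s}\, B^{(N)}_{i+1,s+\varepsilon}. \]

Hence, for $\ell \ge 1$, its $\ell$-th cumulant writes as (step 1):

\[ \kappa_\ell(A^{(N)}) = \sum_{1 \le i_1,s_1,\dots,i_\ell,s_\ell \le N,\ \varepsilon_1,\dots,\varepsilon_\ell = \pm 1} \kappa\big( B^{(N)}_{i_1,s_1} B^{(N)}_{i_1+1,s_1+\varepsilon_1}, \dots, B^{(N)}_{i_\ell,s_\ell} B^{(N)}_{i_\ell+1,s_\ell+\varepsilon_\ell} \big). \tag{21} \]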

Given two lists i and s of positive integers, we consider the three following graphs:

• $H_1$ has vertex set $[\ell]$ and has an edge between j and k if $|i_j - i_k| \le 2$ and
$|s_j - s_k| \le 2$;

• $H_2$ has vertex set $[\ell]$ and has an edge between j and k if
$\{i_j, i_j \pm 1, s_j, s_j \pm 1\} \cap \{i_k, i_k \pm 1, s_k, s_k \pm 1\} \ne \emptyset$;

• $H_3$ has vertex set $[\ell] \sqcup [\bar{\ell}]$ and has an edge between j and k (resp. j and $\bar{k}$,
$\bar{j}$ and $\bar{k}$) if $|i_j - i_k| \le 2$ (resp. $|i_j - s_k| \le 2$, $|s_j - s_k| \le 2$).

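We will use Theorem 1.4 to give a bound for

\[ \kappa\big( B^{(N)}_{i_1,s_1} B^{(N)}_{i_1+1,s_1+\varepsilon_1}, \dots, B^{(N)}_{i_\ell,s_\ell} B^{(N)}_{i_\ell+1,s_\ell+\varepsilon_\ell} \big). \]

Clearly, the number $M(i, s)$ of different couples in the set

\[ \{(i_j, s_j);\ 1 \le j \le \ell\} \cup \{(i_j + 1, s_j + \varepsilon_j);\ 1 \le j \le \ell\} \]

is at least equal to $2\#(CC(H_1)) \ge \#(CC(H_1)) + 1$. Besides, in this case, the
graph $G'_2$ introduced in section 1.3 has the same vertex set as $H_2$ and fewer edges.
Hence it has more connected components. Therefore, Theorem 1.4 implies (step
2):

\[ \Big| \kappa\big( B^{(N)}_{i_1,s_1} B^{(N)}_{i_1+1,s_1+\varepsilon_1}, \dots, B^{(N)}_{i_\ell,s_\ell} B^{(N)}_{i_\ell+1,s_\ell+\varepsilon_\ell} \big) \Big| \le C_{2\ell}\, N^{-\#(CC(H_1)) - \#(CC(H_2))}. \]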

But, using the terminology of section 4.3, the graphs $H_1$ and $H_2$ are the
surcontraction and contraction of $H_3$. Therefore, by Lemma 4.2, one has:

\[ \#(CC(H_3)) \le \#(CC(H_1)) + \#(CC(H_2)). \tag{22} \]

Besides, Lemma 7.1 implies that the number of lists i and s with entries in $[N]$ such
that $H_3$ has exactly t connected components is bounded from above by $C'_{2\ell,D} N^t$
for D well-chosen (step 3). In particular the constant $C'_{2\ell,D}$ does not depend on
N. Therefore, the total contribution of these lists to equation (21) is bounded from
above by $C_{2\ell} N^{-t} \cdot C'_{2\ell,D} N^t = C_{2\ell} \cdot C'_{2\ell,D}$.

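Finally,

\[ \kappa_\ell(A^{(N)}) = O(1). \]

Moreover, only lists such that $M(i, s) = 2$ and $\#(CC(H_1)) = 1$ contribute to the
term of order 1. But this implies that the lists i, s and $\varepsilon$ are constant. In other
words,

\[ \kappa_\ell(A^{(N)}) = \sum_{1 \le i,s \le N,\ \varepsilon = \pm 1} \kappa_\ell\big( B^{(N)}_{i,s} B^{(N)}_{i+1,s+\varepsilon} \big) + O(N^{-1}). \]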

The $2(N-1)^2$ variables $B^{(N)}_{i,s} B^{(N)}_{i+1,s+\varepsilon}$ are Bernoulli variables, whose parameters
are given by:

• if $s = i \in [N-1]$ and $\varepsilon = 1$, then the parameter is $\frac{\theta^2}{(N+\theta-1)(N+\theta-2)}$
($N - 1$ cases);

• if $s = i$; $\varepsilon = -1$ (here $2 \le i \le N-1$) or $s = i+1$; $\varepsilon = -1$ (here
$1 \le i \le N-1$) or $s = i+2$; $\varepsilon = -1$ (here $1 \le i \le N-2$), then the
parameter is $\frac{\theta}{(N+\theta-1)(N+\theta-2)}$ ($3N - 5$ cases);

• otherwise, the parameter is $\frac{1}{(N+\theta-1)(N+\theta-2)}$.

Recall that the cumulants of a sequence of Bernoulli variables $X^{(N)}$ of parameters
$(p_N)_{N \ge 1}$ with $p_N \to 0$ are asymptotically given by $\kappa_\ell(X^{(N)}) = p_N + O(p_N^2)$.
Hence,

\[ \kappa_\ell(A^{(N)}) = 2(N-1)^2 \cdot \frac{1}{N^2} + O(N^{-1}) = 2 + O(N^{-1}). \]

Finally, the cumulants of $A^{(N)}$ converge towards those of a Poisson variable of
parameter 2, which implies the convergence of $A^{(N)}$ in distribution. □
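The fact used in the proof, that every cumulant of a Bernoulli variable of small parameter p equals $p + O(p^2)$, can be checked with the standard recursion between moments and cumulants, $\kappa_n = m_n - \sum_{k=1}^{n-1} \binom{n-1}{k-1} \kappa_k\, m_{n-k}$. The sketch below uses this recursion (it is not taken from the paper) together with the fact that all moments of a Bernoulli variable equal p; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def bernoulli_cumulants(p, order):
    """Cumulants kappa_1..kappa_order of a Bernoulli(p) variable.

    Uses kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) * kappa_k * m_{n-k},
    with m_j = p for every j >= 1. The l-th cumulant is at index l - 1.
    """
    p = Fraction(p)
    kappa = []
    for n in range(1, order + 1):
        correction = sum(comb(n - 1, k - 1) * kappa[k - 1] * p for k in range(1, n))
        kappa.append(p - correction)
    return kappa
```

With `p = Fraction(1, 100)`, each of the first few cumulants differs from p by a quantity of order $p^2$, in line with the expansion recalled in the proof.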

7.3. Dashed patterns. Let us recall the definition of dashed patterns in a permu- 
tation, as introduced by E. Babson and E. Steingrimsson lO. 

Definition 7.3. A dashed pattern of size p is the data of a permutation $\tau \in S_p$ and
a subset X of $[p-1]$. An occurrence of the dashed pattern $(\tau, X)$ in a permutation
$\sigma \in S_N$ is a list $i_1 < \dots < i_p$ such that:

• for any $x \in X$, one has $i_{x+1} = i_x + 1$;

• $\sigma(i_1), \dots, \sigma(i_p)$ is in the same relative order as $\tau(1), \dots, \tau(p)$.

The number of occurrences of the pattern $(\tau, X)$ will be denoted $O^{(N)}_{\tau,X}(\sigma)$.

Example 7.4. $O^{(N)}_{21,\emptyset}$ is the number of inversions, while $O^{(N)}_{21,\{1\}}$ is the number of
descents. Many classical statistics on permutations can be written as the number of
occurrences of a given dashed pattern or as a linear combination of such statistics,
see (21 section 2]. 
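Definition 7.3 translates directly into a brute-force counter, which may help to check small cases such as Example 7.4. This is only an illustrative sketch (exponential in p, intended for small permutations); the function name and the encoding of X as a set of 1-based positions are choices made here.

```python
from itertools import combinations


def dashed_pattern_count(sigma, tau, X):
    """Number of occurrences of the dashed pattern (tau, X) in sigma.

    sigma, tau: permutations in one-line notation (sequences of distinct integers).
    X: set of 1-based indices x for which positions i_x and i_{x+1} must be adjacent.
    """
    p = len(tau)
    count = 0
    for pos in combinations(range(len(sigma)), p):      # i_1 < ... < i_p (0-based)
        if any(pos[x] != pos[x - 1] + 1 for x in X):     # requires i_{x+1} = i_x + 1
            continue
        vals = [sigma[i] for i in pos]
        same_order = all(
            (vals[a] < vals[b]) == (tau[a] < tau[b])
            for a in range(p)
            for b in range(a + 1, p)
        )
        if same_order:
            count += 1
    return count
```

For the permutation 3142, `dashed_pattern_count([3, 1, 4, 2], (2, 1), set())` counts the inversions and `dashed_pattern_count([3, 1, 4, 2], (2, 1), {1})` counts the descents.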

In this section, we prove Theorem ll.6[ which gives, for any given dashed pattern 
$(\tau, X)$, the asymptotic behavior of the sequence $(O^{(N)}_{\tau,X})_{N \ge 1}$ of random variables.



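Proof. As in the previous examples, we write the quantity we want to study in
terms of the variables $B^{(N)}_{i,s}$. Here,

\[ O^{(N)}_{\tau,X} = \sum_{\substack{i_1 < \dots < i_p \\ \text{for all } x \in X,\ i_{x+1} = i_x + 1}} \ \sum_{\substack{s_1, \dots, s_p \\ s_{\tau^{-1}(1)} < \dots < s_{\tau^{-1}(p)}}} B^{(N)}_{i_1,s_1} \cdots B^{(N)}_{i_p,s_p}. \]

Expanding its cumulants by multilinearity, we get (step 1)

\[ \kappa_\ell(O^{(N)}_{\tau,X}) = \sum_{(i_j^r)} \sum_{(s_j^r)} \kappa\big( B^{(N)}_{i_1^1,s_1^1} \cdots B^{(N)}_{i_p^1,s_p^1}, \dots, B^{(N)}_{i_1^\ell,s_1^\ell} \cdots B^{(N)}_{i_p^\ell,s_p^\ell} \big). \tag{23} \]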

The first (resp. second) summation index is the set of matrices $(i_j^r)$ (resp. $(s_j^r)$)
with $(j, r) \in [p] \times [\ell]$ such that:

• for all r, $i_1^r < \dots < i_p^r$ (resp. $s^r_{\tau^{-1}(1)} < \dots < s^r_{\tau^{-1}(p)}$);

• for all r, for all $x \in X$, $i^r_{x+1} = i^r_x + 1$ (resp. no extra condition on the s's).
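Given such lists i and s, we consider the four following graphs:

• $H_1$ has vertex set $[p] \times [\ell]$ and has an edge between $(j, r)$ and $(k, t)$ if
$|i_j^r - i_k^t| \le 1$ and $s_j^r = s_k^t$;

• $H_2$ has vertex set $[p] \times [\ell]$ and has an edge between $(j, r)$ and $(k, t)$ if
$\{i_j^r, i_j^r + 1, s_j^r\} \cap \{i_k^t, i_k^t + 1, s_k^t\} \ne \emptyset$;

• $H_3$ has vertex set $([p] \times [\ell]) \sqcup (\overline{[p] \times [\ell]})$ and has an edge between $(j, r)$
and $(k, t)$ (resp. $(j, r)$ and $\overline{(k, t)}$; $\overline{(j, r)}$ and $\overline{(k, t)}$) if $|i_j^r - i_k^t| \le 1$ (resp.
$s_k^t - i_j^r = 0$ or $1$; $s_j^r = s_k^t$);

• $H'_2$ has vertex set $[\ell]$ and has an edge between r and t if

\[ \Big( \bigcup_{1 \le j \le p} \{i_j^r, i_j^r + 1, s_j^r\} \Big) \cap \Big( \bigcup_{1 \le k \le p} \{i_k^t, i_k^t + 1, s_k^t\} \Big) \ne \emptyset. \]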

The graphs $H_1$ and $H_2$ are respectively the surcontraction and contraction of $H_3$,
as defined in Section 4. Therefore, one has, by Lemma 4.2,

\[ \#(CC(H_3)) \le \#(CC(H_1)) + \#(CC(H_2)). \]

But one can further contract $H_2$ by the map $f : [p] \times [\ell] \to [\ell]$ defined by $f(j, r) = r$
and we obtain $H'_2$. With the notation of Section 4, it implies:

\[ \#(CC(H_2)) \le \#(CC(H'_2)) + \sum_{r=1}^{\ell} \big[ \#\big(CC(H_2[[p] \times \{r\}])\big) - 1 \big]. \]

But each induced graph $H_2[[p] \times \{r\}]$ (for $1 \le r \le \ell$) contains at least an edge
between $(x, r)$ and $(x + 1, r)$ for each $x \in X$ (because we assumed $i^r_{x+1} = i^r_x + 1$).
Thus it has at most $p - q$ connected components (with $q = |X|$). Finally,

\[ \#(CC(H_3)) \le \#(CC(H_1)) + \#(CC(H'_2)) + (p - q - 1)\ell. \tag{24} \]

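Let us apply the main lemma (Theorem 1.4) to obtain a bound for

\[ \kappa\big( B^{(N)}_{i_1^1,s_1^1} \cdots B^{(N)}_{i_p^1,s_p^1}, \dots, B^{(N)}_{i_1^\ell,s_1^\ell} \cdots B^{(N)}_{i_p^\ell,s_p^\ell} \big). \]

In this case, the number of different couples in the indices of the Bernoulli variables
is at least the number of connected components of $H_1$. Besides, the graph $G'_2$
introduced in section 1.3 has the same vertex set, but fewer edges than $H'_2$. Hence,
it has more connected components and we obtain:

\[ \Big| \kappa\big( B^{(N)}_{i_1^1,s_1^1} \cdots B^{(N)}_{i_p^1,s_p^1}, \dots, B^{(N)}_{i_1^\ell,s_1^\ell} \cdots B^{(N)}_{i_p^\ell,s_p^\ell} \big) \Big| \le C_{p\ell}\, N^{-\#(CC(H_1)) - \#(CC(H'_2)) + 1}. \]

Using inequality (24) above, this can be rewritten as (step 2)

\[ \Big| \kappa\big( B^{(N)}_{i_1^1,s_1^1} \cdots B^{(N)}_{i_p^1,s_p^1}, \dots, B^{(N)}_{i_1^\ell,s_1^\ell} \cdots B^{(N)}_{i_p^\ell,s_p^\ell} \big) \Big| \le C_{p\ell}\, N^{-\#(CC(H_3)) + (p-q-1)\ell + 1}. \]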



As in the previous section, Lemma 7.1 asserts that the number of couples of lists
$((i_j^r), (s_j^r))$ such that $\#(CC(H_3)) = t$ is smaller than $C'_{2p\ell,D} N^t$ for a well chosen
D (step 3). Hence their total contribution to Equation (23) is bounded from above
by the quantity $C_{p\ell}\, C'_{2p\ell,D}\, N^{(p-q-1)\ell+1}$. Finally, one has:

\[ \kappa_\ell(O^{(N)}_{\tau,X}) = O(N^{(p-q-1)\ell+1}), \tag{25} \]

or equivalently $\kappa_\ell(Z^{(N)}_{\tau,X}) = O(N^{-\ell/2+1})$. As in section 6.4, the theorem follows
from this bound and from the limits of the normalized expectation and variance.

For the expectation, we have to consider the case $\ell = 1$. In this case, one has
$\#(CC(H_1)) = p$ and $\#(CC(H'_2)) = 1$. Therefore, if we want an equality in
Equation (24), we need $\#(CC(H_3)) = 2p - q$, which implies that all entries in the
lists i and s are distinct. For these lists, one has (Lemma 3.1)

\[ \kappa\big( B^{(N)}_{i_1,s_1} \cdots B^{(N)}_{i_p,s_p} \big) = E\big( B^{(N)}_{i_1,s_1} \cdots B^{(N)}_{i_p,s_p} \big) = \frac{1}{(N+\theta-1)_p}, \]

which is asymptotically $N^{-p}$. But the number of lists with distinct entries in the
index set of equation (23) is asymptotically $\frac{N^{2p-q}}{p!\,(p-q)!}$. Finally,

\[ \lim_{N \to \infty} \frac{1}{N^{p-q}}\, E(O^{(N)}_{\tau,X}) = \frac{1}{p!\,(p-q)!}. \]
It remains to prove that the renormalized variance has a
limit $V_{\tau,X} \ge 0$ when N tends to infinity. But this follows from the bound (25) and
the fact that any $\kappa_\ell(O^{(N)}_{\tau,X})$ is a rational function in N. Let us explain the latter
fact.

Recall that $\kappa_\ell(O^{(N)}_{\tau,X})$ is given by equation (23). We can split the sum depending
on the graph $H_3$ associated to the matrices i and s and on the actual value $\delta_e(i, s)$
of $i_j^r - i_k^t$ (or $s_k^t - i_j^r$ and $s_j^r - s_k^t$ respectively) for each edge e of $H_3$. Then the fact
that $\kappa_\ell(O^{(N)}_{\tau,X})$ is a rational function is an immediate consequence of the following
points:

• the number of graphs $H_3$ and possible values for the differences $\delta_e(i, s)$
(for $e \in E_{H_3}$) are finite;

• the cumulant $\kappa\big( B^{(N)}_{i_1^1,s_1^1} \cdots B^{(N)}_{i_p^1,s_p^1}, \dots, B^{(N)}_{i_1^\ell,s_1^\ell} \cdots B^{(N)}_{i_p^\ell,s_p^\ell} \big)$ is a rational
function in N which depends only on the graph $H_3$ and values of $\delta_e(i, s)$ (for
$e \in E_{H_3}$);

• the number of matrices i and s corresponding to a given graph G and given
values $\delta_e(i, s)$ is a polynomial in N. □

7.4. Generalized patterns and local statistics. The notion of dashed patterns has
been recently generalized by several authors in [10, Section 2]. The idea is roughly
that, in an occurrence of a generalized pattern, one can ask that some values are
consecutive (and not only some places as in dashed patterns). It would be
interesting to give a general theorem on the asymptotic behavior of the number of
occurrences of a given generalized pattern. This seems to be a hard problem as
many different behaviors can occur:

• The number of adjacencies is the sum of the numbers of occurrences of two
different generalized patterns and converges towards a Poisson distribution.

• The dashed patterns are special cases of generalized patterns. As we have
seen in the previous section, their number of occurrences converges, after
normalization, towards a Gaussian law. Other generalized patterns exhibit
the same behavior, for example the one considered in [10] (the proof is
the same as for dashed patterns; note that Remark ?? does not hold for
occurrences of this pattern).

• Other behaviors can occur: for example, it is easy to see that the number of
occurrences of the pattern (123, {1}, {1}) (we use the notations of [10])
has an expectation of order n, but a probability of being 0 that is bounded
from below by a positive constant.

Even if we have not been able to give a general statement, our approach unifies the
first two cases.

The notion of generalized patterns can be further extended to that of local
statistic. Fix an integer $p \ge 1$ and a set S of constraints: a constraint is an equality
or inequality (weak or strict) whose members are of the form $i_j + d$ or $s_j + d$ where
j belongs to $[p]$ and d is some integer. Then, for a permutation $\sigma$ of $S_N$, we define
$O^{(N)}_{p,S}(\sigma)$ as the number of lists $i_1, \dots, i_p$ and $s_1, \dots, s_p$ satisfying the constraints
in S and such that $\sigma(i_j) = s_j$ for all j in $[p]$. For instance, the number of d-descents
studied in [8] is a local statistic.

We call any linear combination of statistics $O^{(N)}_{p,S}$ a local statistic. The number
of occurrences of a generalized pattern, but also the number of excedances or of
cycles of a given length p, are examples of local statistics. The method presented
in this article is suitable for the asymptotic study of joint vectors of local statistics.
We have failed to find a general statement, but we are convinced that our approach
can be adapted to many more examples than the ones studied in this article.

However, the method does not seem appropriate for global statistics, such as the
total number of cycles of the permutation or the length of the longest cycle.